These experiments are integrated into a consistent
conceptual scheme of personality, description
and development, covering neurotic
the principles of dimensional analysis as a
tion of the findings to a variety of practical
problems. Among the fundamental problems 
submitted to experiment are the relation be- 
tween neurosis and psychosis, the inheritance of 
emotional instability, the relative usefulness of 
‘atomistic’ and ‘gestalt’ methods of approach, 
the verification of typological schemes, such 
as those of Kretschmer and Jung, the possi- 
bility of exact measurement of dimensions of 
personality, and the validity and reliability 
If we take in our hand any volume, let us ask,
Does it contain any abstract reasoning concern- 
ing quantity or number? No. Does it contain 
any experimental reasoning concerning matter 
of fact and existence? No. Commit it then to
the flames; for it can contain nothing but
sophistry and illusion.—Hume. 
In psychology, as in other sciences, knowledge has been advanced
by new techniques and new instruments, as well as by
new modes of thought, for ‘the bare hand and the understand- 
ing left to itself have but little power; results are produced by 
instruments and helps’. Whether the methods so cogently set out 
in this book will have great potency in the study of human person- 
ality is yet uncertain: the subject is elusive and the past failures, as 
the author points out, many. But Dr. Eysenck’s enterprise has now 
come a long way, in a rather short time, and even those who dis- 
trust statistical procedures and statisticians’ reasoning, or who find 
some of Dr. Eysenck’s conclusions at odds with their experience,
will concede his two major claims: he has used methods of observa- 
tion and analysis which are well defined and easily reproduced by 
other investigators, who can thus test the correctness of his find- 
ings; and there is immense gain to be had from a taxonomy which 
permits the cardinal attributes of each type to be measured, 
whether the types are of mental disease, of personality, or of 
general constitution. The more accurate the measurement and 
the more distinct the type, the greater the security which such a
classification affords for further scientific inquiry. There are perhaps
other advantages. If the types discerned show any constant
relation with the course and pathology of disease, then they will 
be speedily enthroned, because of their value in clinical practice. 
But it is not on this ground that Dr. Eysenck takes his stand; he 
emphasizes rather the scientific and heuristic value of his research. 
He takes no obvious steps in his book to reconcile any complacent 
psychiatrists to what might seem a root and branch attack upon 
their beliefs and methods. Psychiatrists, however, are not com- 
placent unless they are stupid; and Dr. Eysenck is by no means 
blind to the worth of clinical knowledge: clinical categories pro- 
vide the indispensable starting-point for his investigation. His 
argument for an oscillating use, turn and turn about, of criterion 
and factor, clinical assessment and distillate of test-results, commends
itself to the psychiatrist, well aware that current methods of
diagnosis and appraisal, though less shaky than Dr. Eysenck be- 
lieves, have signal weaknesses; they are the weaknesses to which 
Jaspers drew critical attention thirty years ago and which were 
openly reckoned with in the last edition of Kraepelin’s great 
textbook. 

The debt of psychiatry to the psychologist is now great and
growing. Fortunately the indebtedness seems mutual, and the
association between the two fields of study most profitably intimate.
From these rigorous inquiries, sustained and resourcefully
developed over years, psychiatry stands to gain an impetus and
accuracy in some essential matters which will advance it and
reinforce the free play of clinical skill and insight.
INTRODUCTION


This book is a sequel to Dimensions of Personality, and reports
the results of experiments carried out at the Institute of
Psychiatry (Maudsley and Bethlem Royal Hospitals) in the
University of London. Where the other book dealt almost exclusively
with adult neurotics, we have now extended our scope,
and the present report deals with normal and psychotic adults, as
well as with neurotics, and with children as well as with adults.
The aim has remained the same, namely ‘to discover the main
dimensions of personality, and to define them operationally, i.e.
by means of strictly experimental, quantitative procedures’. An
attempt has been made to advance more rigorous proofs for the
heuristic hypotheses outlined in the previous work, and to make
deductions, and test these hypotheses, in settings other than a
mental hospital: among factory workers, nurses, students, teachers,
and other normal groups. An attempt has also been made to
indicate the practical usefulness of the methods outlined by employing
them in studies of the effectiveness of occupational selection,
both in the normal and the abnormal field, of the effectiveness of
surgical operations in the alleviation of mental illness, and of the
factors determining vocational adjustment. Throughout the book
methods of statistical analysis and proof have been used which
follow the time-honoured hypothetico-deductive method; this
has occasionally necessitated the introduction of novel techniques,
such as the method of criterion analysis. At all times, an attempt
has been made to imbricate closely the psychological theory which
underlies this work and the experimental procedures adopted.
Finally, I have tried to eschew that meretricious sesquipedalianism
which threatens to turn psychology from a scientific discipline
into a stamping-ground for experts in semantics and verbal
magic.

Much of the material here reported was originally delivered in
the form of a series of open lectures at the University of Pennsylvania
during my tenure there of a Visiting Professorship. Other
parts of the book were delivered as lectures at the E.T.S. Invitational
Conference in New York, at the 1950 Psychiatric Congress
in Copenhagen, at a Conference on ‘Current Trends in Psychology’
in Pittsburgh, at the International Congress of Anthropology in
Paris, at the 1949 A.P.A. meeting in Denver, at the International 
Psychological Congresses in Edinburgh and Stockholm, at various 
meetings of the B.P.S., and at a number of American Universities 
from Canada to the South, and from California to New England, 
all of whom kindly gave me facilities to put before their students 
certain aspects of the methods used and the results achieved. The 
response of these very varied audiences, ranging from lay through 
undergraduate to graduate and professional levels, and having 
orientations ranging from the clinical and psychiatric to the 
mathematical and statistical, has taught me a great deal about 
possible lines of criticism and areas of difficulty. But above all they 
have taught me that an essential part of a book of this kind must 
be a clear statement of the aim and purpose which has motivated 
the investigator. It soon became clear to me that many who on the 
whole were sympathetic to the general tenor of the work were a 
little puzzled as to its main import. This difficulty of ‘placing’ was 
so universal that I have ventured to include an opening chapter in 
which I have tried to relate the accepted principles of scientific 
methodology to the experimental study of personality, and to 
indicate why in my opinion the areas investigated are important, 
and why the method used was thought superior to others more 
familiar. This was considered all the more vital in view of the 
wide divergence of aims which is so characteristic of the many 
different types of professional workers who are interested in human 
personality. 

Several convictions which I held before work was started on 
the research reported here were strongly reinforced during its 
progress. The first is that in the field of personality study, close 
co-operation between psychologists and psychiatrists is not only 
desirable but may be considered essential. The psychologist may 
rightly consider much of what the psychiatrist says subjective and 
intuitive rather than exact and objective; that does not necessarily 
mean that it is wrong. He would be foolish to throw away the 
accumulated experience of centuries in the vain hope of starting 
with a tabula rasa. Regarded as a source of fruitful hypotheses, 
rather than as a store-house of revealed truth, psychiatric know- 
ledge may save the psychologist much fruitless burrowing and 
searching, and lead him quickly to those areas most germane to 
his objective. The psychiatrist may rightly consider much of what 
the psychologist does as arid and unrevealing; to condemn it all
may be a short-sighted attitude. In science, coming events do not
always cast their shadows before them, and the growth of a 
genuinely scientific discipline of personality study may be the out- 
come of much that appears at first sight unpromising and irrelevant. 
Both sides have much to learn from each other, and our knowledge 
of human nature will advance more securely and more quickly by 
a pooling of knowledge, experience, and methodology.

The second conviction, which lay already at the basis of the
work reported in Dimensions of Personality, is that research in this 
area, in order to be fruitful, has to be what Marquis (1948) has 
called ‘programme research’. In other words, research has to be 
planned for a whole department around a common objective, it 
has to be continued over a period of years, and the research 
activities of students and teachers alike have to be integrated into 
it. Only in this way can one hope to avoid the inconclusive out- 
come which all too often characterizes the individual short-term 
research. 

Individual students, of course, must be free to choose whether 
they want to take part in the departmental research, or whether 
they want to strike out for themselves; it is our experience that the 
majority welcome the opportunity of making an active contribu- 
tion to the development of a programme which shows promise of 
furthering our knowledge of human nature. Such training will also 
prepare the student for the type of team research which is becoming 
more and more prevalent in the social field, where specialists in 
many disciplines must collaborate in order to solve a problem 
common to them all. 

The third conviction, elaborated in various places throughout
this book, relates to the use of different types of personality test. 
While questionnaires and so-called ‘projective’ tests constitute 
almost the only types of test used by clinical psychologists and 
research workers, the investigations described in Dimensions of 
Personality and here have more and more confirmed my belief that 
progress in the scientific study of personality is intimately bound 
up with what may be called ‘objective behaviour tests’ (Eysenck, 
1950a). These appear to tap more fundamental layers of personality 
than do other types of test. If we deduce personality from behaviour 
(as surely we must if we want to stay on solid ground) then the 
argument from one item of behaviour to another seems more 
cogent than the argument from what a person says about his 
behaviour, or the inference from what he says about a picture or 
an ink blot, to what he actually does. This belief, of course, is
subject to empirical verification; it is given here as an heuristic 
hypothesis, not as a statement of fact. 

My last conviction relates to the importance of taxonomic 
work, of finding the dimensions along which we want to measure 
before we proceed to carry out any measurement. At the present 
moment, psychology and psychiatry are still in the pre-classifica- 
tory stage—although some optimists appear to think that both 
disciplines have already ‘outgrown’ this stage. Attendance at 
psychiatric case conferences inevitably calls to mind Punch’s com- 
ment on the new freight charge system introduced by British rail- 
ways around the middle of the last century, particularly in its 
application to livestock. An old lady is shown, carrying a tortoise in
a case, whilst a porter, scratching his head to fit this exotic animal
into the system of classification laid down by the Company,
mumbles to himself: ‘Cats is dawgs and pigs is dawgs, but this ’ere
hanimal is a bloomin’ hinsect.’ I may be unduly sceptical about 
the systems of classification now in use, but they seem to me prac- 
tically of little value, theoretically indefensible, and internally 
contradictory. Yet, for reasons to be explained in some detail later 
in the book, I cannot agree with those who would do away with 
any kind of taxonomic system either. As has been said of Christianity,
a proper system of diagnosis has not been tried and found
wanting; it has been found difficult to construct and therefore 
never tried. Surely our task should be that of patiently construct- 
ing such a system by using the well-known principles of scientific 
method; cut away ruthlessly those existing concepts and procedures
which lack reliability and validity, and treasure those which
stand up to the rigours of the hypothetico-deductive method. It is as
a report on the first stages of such an endeavour that these pages 
should be read. If the results are less than startling, they are at 
least based on experiment and deduction; if the methods are less 
than sound, they are at least subject to disproof and discussion: if 
the conceptual framework is less than adequate, at least it is capable
of growth and change. If this book has no other effect than to
spur on others to devise better methods, achieve results of greater 
value, and arrive at more widely relevant concepts, I shall be 
amply repaid for my labour in writing it. 

One of the pleasures attending the completion of a communal 
effort such as the work reported in this book, is the opportunity of 
expressing one's thanks to those whose help contributed so largely 
towards whatever success it may be deemed to have achieved.
This group includes present and former colleagues, research assist- 
ants, students, and members of the clerical and computing staffs, 
as well as numerous friends here and in the United States whose 
criticism and advice helped me greatly in formulating problems 
and research designs more clearly. Special mention should be 
made of the work done by A. Petrie on the after-effects of leuco- 
tomy and on selection problems; by H. T. Himmelweit and 
A. Petrie on student selection and the measurement of neuroticism 
in children; by N. O’Connor and J. Tizard on the employability 
of mental defectives; by F. Goldmann-Eisler on the etiology of the 
‘oral’ type; by F. Loos on the relation of ‘sense of humour’ to 
neuroticism; by G. Granger on the applicability of atomistic and 
gestalt laws to preference judgments; by A. Clark and A. Gravely 
(now Mrs. Clark) on perceptual and motor tests as measures of 
personality; by S. Crown on after-effects of leucotomy, and on the 
Word Connection List as a measure of personality; by D. Prell on 
the inheritance of neuroticism; by M. Israel and D. Vinson on the 
subjective versus the objective method of Rorschach interpreta- 
tion; by A. Heron on the objective measurement of personality in 
unskilled workers, as related to work adjustment and productivity; 
by A. Meadows on the reliability of the Rorschach test; by S. Cox 
on the validity of the Rorschach test; by A. Lubin on the applica- 
tion of discriminant function techniques to psychological problems; 
by J. May on the assumptions underlying Holzinger's h² statistic;
by F. Freeman on the Character Interpretation Test, and by 
G. Yukviss and M. Malloy who carried out the testing for various 
researches. 

Computations were carried out by the statistical section of the 
Psychological Department, under the direction of A. Lubin, J. 
May, and, for about one year, A. Jonckheere. The computors who 
carried the main burden were D. Burgess, I. Wiltshire, and J. 
Singleton; the latter also bore the main responsibility for the 
Hollerith work.

D. Furneaux contributed greatly by his advice on apparatus, 
and thanks are due to him and to J. Westhenry and W. Whithers, 
workshop technicians, on whom fell the burden of constructing 
and maintaining apparatus. J. Standen was also helpful in this 
connection. 

S. Francis, now Mrs. McIntosh, typed the manuscript, checked 
the proofs, and assumed responsibility for the bibliography; her 
quick and accurate work considerably lightened the load of
authorship. D. Webb drew the figures with skill and competence. 
F. Loos and S. Jeff took the photographs, which were posed by 
various members of the Psychological Department. 

Professor P. E. Vernon read the manuscript and saved the 
reader from a good measure of repetition and obscurity; the writer 
can hardly refrain from accepting full responsibility for what has 
remained of these twin faults. 

Professor Aubrey Lewis gave constant support to the research 
programme of the Department, and it is no exaggeration to say 
that without him this work could not have been completed. Many 
other colleagues and friends at the Bethlem and Maudsley Hos- 
pitals contributed of their time and advice, and lent their support 
to our plans; they are too numerous to name here. 

My wife contributed directly to this research by her great skill 
in testing difficult and sometimes hostile patients; this, however, is 
not the only reason why this book is most fittingly dedicated to her. 

H. J. EYSENCK 
Institute of Psychiatry
Maudsley and Bethlem Royal Hospitals


Chapter One 


SCIENCE AND PERSONALITY 


The word ‘science’ has great prestige value in our society.
Like other words having great prestige value it is used in
many contexts where one may legitimately doubt whether
those who use it mean by it what philosophers, logicians, and
scientists themselves have agreed upon as the correct usage.
‘Scientific palmistry’, ‘Christian science’, ‘scientific marriage
counselling’, and ‘principles of scientific Marxism’ are catchwords
which use an honoured term as a bait; their relation to science is
at best tangential and usually non-existent. In the examples given,
the deception will be obvious to most readers; in other cases we
may not be so sure, and some kind of criterion may be required to
judge accurately. If we look at the flood of articles dealing with
‘personality’, ‘clinical psychology’, ‘counselling’, ‘psychoanalysis’,
‘diagnostic testing’, ‘projective techniques’, and similar inter-related
concepts, and attempt to judge their value from the point
of view of science, we may be permitted at least a doubt as to
whether all these contributions can be called genuinely ‘scientific’;
indeed, in a more jaundiced mood we may wonder if any of
them deserve this title.

Much argument on this point has suffered from a lack of
definition of the term ‘scientific’, both parties to the argument
using this word in different senses. Finding their contributions
unacceptable as scientific in the orthodox sense, many workers in
the field of personality have surreptitiously altered the meaning of
‘scientific’ in such a way that it still retains its prestige value, but
is now emptied of precisely those elements which originally gave
rise to this esteem. This procedure of changing the meaning of
words is familiar in the political field, where words such as
‘democracy’, which have relatively clear meaning and content,
are used by some who wish to retain the prestige value of the term
for régimes which clearly do not fall under this heading, to mean
almost precisely the opposite to what the word has hitherto stood
for. Semantic manœuvres of this type create much confusion and
necessitate a careful analysis of the meaning of the words used.
They are illustrative of the Alice-in-Wonderland atmosphere
which pervades so much of this field. ‘When I use a word’, said 
Humpty-Dumpty, ‘it means just what I choose it to mean— 
neither more nor less.’ ‘The question is,’ said Alice, ‘whether you
can make words mean different things.’ ‘The question is,’ said 
Humpty-Dumpty, ‘which is to be Master—that’s all.’ 

There are clearly two kinds of psychology, just as there are two 
kinds of physics. Eddington (1928) has contrasted these two types 
of physics by writing about his two writing tables—the sensible 
table, as perceived by him, and as accepted by common sense, 
solid and impenetrable—and the table as modern physics sees it, 
being constituted mainly of empty space, with a very large number 
of very small electrons, protons, neutrons and other bodies and 
electric charges moving in this space. Absurd as this second con- 
ception of the table may sound to the naive observer, we are willing 
to accept it because by accepting it, and all the concepts and laws 
which are involved in it, we are able to bring order into the great 
mass of facts which the world presents to us, and because predic- 
tions made on this basis have a habit of being borne out when we 
apply an empirical check. Physics started from the actual table as 
presented to sense experience, but it has left it a very long way 
behind. In doing so, it has also lost a good deal of the feeling of 
certainty which attaches to common-sense observation, replacing
it by an attitude of tentative belief in the relative accuracy of
current hypotheses.

1 This opposition between science and common sense is well brought out by
Galileo in his famous Dialogues (1661):

‘I cannot sufficiently admire the eminence of those men’s wits, that have
received and held it to be true, and with the sprightliness of their judgments
offered such violence to their own senses, as that they have been able to prefer
that which their reason dictated to them, to that which sensible experience
represented most manifestly to the contrary. I cannot find any bounds for my
admiration, how that reason was able in Aristarchus and Copernicus, to
commit such a rape on their senses, as in despite thereof to make herself mistress
of their credulity.’

Einstein recently emphasized a similar point:

‘Advances in scientific knowledge must bring about the result that an
increase in formal simplicity can only be won at the cost of an increased
distance or gap between the fundamental hypotheses of the theory on the one
hand, and the directly observed facts on the other.’

In a similar way, psychology starts out with the observations
of common sense. In certain limited fields, it has succeeded in
getting beyond common sense, and in elaborating concepts and
laws which bring order into a relatively small field and which 
enable us to make moderately accurate predictions. In other fields, 
it is still painfully close to common sense and has failed signally to 
elaborate concepts and laws of its own. In most fields, it shows a 
stubborn tendency to cling to feelings of certainty and to refuse to 
hold tentative beliefs, and to regard current hypotheses as at best 
very rough-and-ready approximations to those likely in the long 
run to be found most useful. 

In contrasting common-sense observation and scientific 
theory, we do not mean to imply that common sense is invariably 
wrong and science invariably right. The difference lies essentially 
in the method by means of which one's opinion is arrived at. If this 
method follows the dictates of science, the result may be wrong, 
but it will be a wrongness that is self-correcting. If the method is 
not that of science the result may be right, but we have no way of 
knowing that it is right. If it is wrong, we have no way of correcting 
it. Many firmly held beliefs of common sense are wrong, although 
they may superficially have much to recommend them. Others 
may be right or wrong; we have no evidence to decide. 

The two kinds of psychology which we have distinguished,
that of common sense and that of science, do not differ only from
the point of view of method; they also seem to differ from the point 
of view of aim. Common-sense psychology seeks to understand. If 
another person acts in the same way that I would act, then in some 
way I may say that I understand his actions. If he acts in a way 
different to the way in which I would act, but nevertheless 
familiar to me because I have seen many others behave in this 
fashion, I still feel that I understand him. If his action is un- 
familiar to me, I may try to understand it by reducing it to familiar 
motives experienced in unusual circumstances. Much of psychiatry 
consists in such an attempt at understanding, and it is often stated 
that the good psychiatrist or psychologist must possess ‘empathy’ 
which enables him to ‘feel himself into’ his patient. 

The aim of science is quite different, and rather more austere. 
Mach (1942), Ostwald (1911), Pearson (1911), and many other 
scientists have maintained that the only real world is a sensible
world, and that scientific theories are merely descriptions of the 
sensible world. In other words, science aims at description. This 
dictum is easily misunderstood. ‘A complete description of natural
phenomena is impossible, and if it were not impossible it would be
useless, for the aim of science is to make the primary data intelligible
by exhibiting their mode of connection. Without abstraction
this would be impossible but a complete description would not
permit of abstraction’ (Stebbing, 1930). Obviously, then, description
may be understood in two senses. We may refer to description
at an elementary level, as when we describe minutely the charac- 
teristics of a given table, or a blade of grass, or a whale. Or we 
may refer to description at a higher level, as when we describe the 
movements of the planets in terms of parabolas, or the behaviour 
of electric particles in terms of field theory. This latter type of 
description is often called ‘explanation’, because by referring the 
more elementary, individual phenomena observed at the low level 
of description to the laws which are abstracted from these ele- 
mentary facts, and which constitute the higher level of description, 
we often feel that we have ‘explained’ the observed facts. All we 
have done, of course, is to give the individual fact a place in a 
unified, consistent system of description; more than this science 
does not attempt to do. If it is clearly understood that the term 
‘explanation’ does not carry any overtones of intuitive or empathic 
understanding, no anthropomorphic ‘feeling oneself into’ things, 
but stands merely for the more abstract level of description, there 
is probably no great danger in using that term, and it will be so 
used in the remainder of this chapter. 

Science, then, tries to describe and to explain. In doing so it 
follows certain rules which experience has shown to be indispen- 
sable to the development of descriptions so comprehensive as to 
deserve the name ‘explanation’. Let us start with an example con- 
trasting non-scientific and scientific explanation, using planetary 
motion as the problem to be explained. ‘The ancient Egyptians, 
starting from the assumption that the universe is like a large box 
of which the earth forms the bottom and the sky the top, supposed 
the stars to be lamps either carried by gods or suspended by cords 
from the top of the box. The sun was supposed to be a god, Ra, 
carried daily in a boat along a river of which the Nile was a branch.
He was born every morning, grew in strength until noon, when he 
turned into another boat. Finally, he was carried in another boat 
during the night back to the east. The eclipse of the sun was 
accounted for by the supposition that a huge serpent sometimes 
attacked the boat. A similar supposition accounted for lunar 
eclipses and the phases of the moon. Like the sun, the moon had 
its enemies. A sow attacks it on the fifteenth day of each month 
and after a fortnight’s agony and increasing pallor, the moon dies
and is born again. Sometimes the sow manages to swallow it 
altogether for a time, causing a lunar eclipse. 

‘Although these suppositions do not involve flagrant contradic- 
tion of the facts upon which they are based and although they are 
the outcome of a desire to explain these facts, nevertheless, this 
Egyptian doctrine is in no sense scientific. Its failure to be scientific
is not due to the fact that it rests upon unproved assumptions. 
Every scientific theory rests ultimately on such assumptions. It 
fails because these assumptions were of such a kind that there could 
be no further evidence in support of them. They were essentially 
unverifiable; they were not susceptible of development; they did 
not suggest the deduction of observable data. ... The Egyptian 
theory, therefore, could neither be developed nor tested. But a 
theory is scientific only if it admits of testing and development’
(Dreyer, 1906). 

As an example of scientific explanation, we may take Newton's 
law of gravitation. Included in his descriptive scheme are all the 
facts which enter into Ptolemy's mathematical description, into 
Copernicus’ heliocentric scheme, and into Kepler's three laws of 
planetary motion. All these facts were not merely fitted into
Newton's system; they could be deduced from it. And once the law 
had been enunciated, such apparently independent and discon- 
nected facts as the phenomenon of the tides, planetary motions, 
the precession of the equinoxes, the phenomenon of weight, the 
motions of cyclones, and the orbit of the comet could all be ex- 
plained in terms of the inverse square law of universal attraction. 
‘Such deductive power is a mark of constructive description.
. . . By its comprehensiveness, by its deliberateness in leading to
new discoveries, in suggesting fresh experiences, and in connecting 
what was hitherto disconnected, it achieves the aim of science in 
making the multiplicity of facts intelligible. Such a theory may be 
said to explain the sensible facts that constitute its data since it is 
not only based upon them but leads deductively back to them.
Further . . . it leads on to other sensible facts which were not
known and thus do not form part of the original data’ (Stebbing,
1930).
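
The sense in which Kepler's third law can be deduced from the inverse square law may be illustrated by a minimal sketch, which is not part of the original text. It assumes the simplified case of a circular orbit about a fixed and very much heavier sun: equating the gravitational attraction GMm/r² with the centripetal force m(2π/T)²r gives T² = 4π²r³/(GM). The function name and the constants (modern values for the sun's gravitational parameter and the astronomical unit) are assumptions introduced here for illustration only.

```python
import math

# Assumed modern constants, not taken from the text.
GM_SUN = 1.32712440018e20    # gravitational parameter of the sun, m^3 / s^2
AU = 1.495978707e11          # one astronomical unit in metres
SECONDS_PER_DAY = 86400.0


def orbital_period_days(orbit_radius_au: float) -> float:
    """Period of a circular orbit deduced from the inverse square law.

    Setting G*M*m / r**2 equal to m * (2*pi / T)**2 * r and solving for T
    gives T = 2*pi * sqrt(r**3 / (G*M)), which is Kepler's third law.
    """
    r = orbit_radius_au * AU
    return 2.0 * math.pi * math.sqrt(r ** 3 / GM_SUN) / SECONDS_PER_DAY


if __name__ == "__main__":
    # Approximate mean distances from the sun, in astronomical units.
    for name, radius_au in [("Earth", 1.0), ("Mars", 1.5237), ("Jupiter", 5.2038)]:
        days = orbital_period_days(radius_au)
        print(f"{name}: deduced period of about {days:.1f} days")
```

The periods are not fitted to observation; they follow from the single law once the distance is given. The deduced values can then be compared with the observed periods, which is the kind of empirical check the passage describes.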


The greater the degree of co-ordination achieved by a theory, 
the more it is liable to be upset by a small discrepancy. A construc- 
tive description, such as Newton’s theory, possesses a high degree 
of co-ordination; it is coherent with respect to a comprehensive 
range of facts. A single fact which is in contradiction to the
hypothesis may necessitate its rejection. Occasionally the theory 
may be salvaged through the use of an ad hoc hypothesis which 
accounts for the discrepant fact, and which can later on be verified. 
Thus, for instance, an irregularity in the orbit of the planet 
Uranus was accounted for by the ad hoc hypothesis that another,
hitherto unknown, planet was responsible for these aberrations.
This planet had to have certain qualities which could be specified 
precisely in accordance with Newton’s laws, and when its position 
had been calculated on the basis of the theory, Neptune was 
actually discovered in the precise place required according to 
theory. Another irregularity in planetary motion, namely, that in 
the orbit of Mercury, was also explained by le Verrier on the basis 
of a similar ad hoc hypothesis, involving an interior planet which 
he called Vulcan. Careful observation failed to reveal any such 
planet, and the hypothesis was discredited. 

‘A multiplication of ad hoc hypotheses is contrary to the aim of 
science which seeks the greatest possible co-ordination of facts. 
Every additional ad hoc hypothesis marks a breakdown of the co- 
ordination. . . . An ad hoc hypothesis necessarily explains a given 
discrepant fact since it has been introduced solely for that purpose. 
If we can appeal to an ad hoc hypothesis whenever there is a 
discrepancy between theory and fact, then we can explain every- 
thing and foretell nothing. It is only permissible to introduce an 
ad hoc hypothesis when consequences can be deduced from it which 
are later verified. Otherwise an alternative method for dealing 
with a fact discrepant with a theory has to be resorted to, namely, 
the rejection of the fundamental assumptions upon which the 
theory is based’ (Stebbing, 1930). 

This is the course followed by Einstein in his attempts to 
account for the negative result of the Michelson-Morley experi- 
ment to detect the earth’s velocity relative to the ether. The experi- 
ment showed that contrary to Newton's laws the motion of the 
earth has no influence upon the propagation of light over the 
earth's surface. While this discrepancy is minute, it nevertheless 
entails either a multiplication of ad hoc hypotheses, or a rejection 
of fundamental assumptions. In spite of the success and immense 
prestige of Newton's constructive description it had to be rejected 
under the pressure of experimental fact unless it were saved by the 
introduction of ad hoc hypotheses. When, as in the case of the 
Neptune hypothesis, the ad hoc hypothesis is verifiable, and sug-
gests an explanation of the discrepancy which is of a kind already
familiar, then the hypothesis is not objectionable. It does not con- 
flict with the aim of science. When, however, it is essentially un- 
verifiable since it suggests what transcends observation, or appeals 
to effects which neutralize each other (as was the case with the 
Lorentz-Fitzgerald contraction suggested to account for the 
Michelson-Morley experiment), then it is not admissible. 

It is an interesting aspect of many modern theories of person- 
ality (particularly the analytic and so-called dynamic ones) that 
they capitalize precisely on these two inadmissible and unverifiable 
types of ad hoc hypotheses. The concept of reaction-formation, for 
instance, implies that cause A may result in behaviour-pattern X
as easily as in non-X—and presumably anything in between. One 
dynamic may cancel out another. Prediction is thus impossible, 
and indeed hardly ever attempted. ‘Verification’ consists not in 
stating an hypothesis, making deductions, and then seeking 
evidence regarding these deductions; it consists in first obtaining 
the evidence and then seeking an explanation, which is invariably 
found through the invocation of unverifiable ad hoc hypotheses. It 
can hardly be argued that this type of procedure is in line with the 
precepts of science, as ordinarily understood. 

To quote but one example from the research literature to 
illustrate this tendency, we may take Symonds’ book on Adolescent 
Fantasy (1949). In it he gives a set of correlations between teachers’ 
ratings of forty children studied in great detail, and their ‘fantasy 
themes’ as determined from the Thematic Apperception test. One 
hundred and seventy-five such correlations are given on page 368 
of the book; they have been plotted in Figure 1 below. It will be
seen that they cluster around zero, and that in view of the high 
sampling errors involved in the calculation of correlations from 
small numbers of cases these figures do not in any way deviate 
from a kind of distribution expected on the basis of the null 
hypothesis, i.e. on the basis of the hypothesis that there is no 
correlation whatever between rating and T.A.T. score. This is the 
only conclusion which can legitimately be drawn from these data, 
and in view of the far-reaching claims which have been made for 
this ‘test’, it is one which is of great interest and significance. 

It is impossible to give an accurate estimate of the divergence 
of the empirical figures from chance expectation. The S.E. when
r = 0 is .16 for forty cases, so that one would expect 68 per cent
of all observed correlations to lie within the limits of +.16 and
−.16. Actually the number of these effectively zero correlations
is slightly larger than this, so that if anything the ratings and the
T.A.T. scores are correlated to an even slighter extent than would 
be expected on a purely chance basis. However, the correlations 
are not independent from each other, so that no exact test can be 
made; such tests apply only to correlations which are themselves 
uncorrelated. This lack of independence also invalidates Symonds’
argument that ‘there is a certain consistency in these relationships
. . . that makes one believe that they have a higher degree of
validity than the statistical probability of departing from chance
would permit one to assume. Variables of opposite meaning will
have relationships on opposite sides of the correlation zero point.’
Such consistency is in no way proof of the extra-chance character
of sets of correlations; it can be deduced from the laws of prob-
ability directly.

Figure 1. Distribution of correlations between Thematic Apperception
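
The figures quoted above can be checked by a short calculation. The following Python sketch is an illustration only and forms no part of Symonds’ analysis: it assumes the approximation S.E. = 1/√(N − 1) for a zero correlation, which gives .16 for forty cases, and it draws 175 mutually independent chance correlations (an assumption which, as noted, the actual data do not satisfy). All function names are hypothetical. The last lines show why ‘consistency’ between variables of opposite meaning proves nothing: reversing the sign of a rating necessarily reverses the sign of its correlation with any third variable.

```python
import math
import random


def pearson_r(xs, ys):
    """Product-moment correlation between two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def se_of_zero_r(n_cases):
    """Assumed approximation: S.E. of r when the true correlation is zero."""
    return 1.0 / math.sqrt(n_cases - 1)


def chance_correlations(n_cases=40, n_correlations=175, seed=0):
    """Correlations between pairs of unrelated random variables."""
    rng = random.Random(seed)
    rs = []
    for _ in range(n_correlations):
        ratings = [rng.gauss(0, 1) for _ in range(n_cases)]
        themes = [rng.gauss(0, 1) for _ in range(n_cases)]
        rs.append(pearson_r(ratings, themes))
    return rs


se = se_of_zero_r(40)
rs = chance_correlations()
share_within = sum(abs(r) <= se for r in rs) / len(rs)
print(f"S.E. = {se:.2f}; share within +/- S.E. = {share_within:.2f}")

# A rating and its opposite must give correlations of opposite sign.
rng = random.Random(1)
rating = [rng.gauss(0, 1) for _ in range(40)]
theme = [rng.gauss(0, 1) for _ in range(40)]
r_a = pearson_r(rating, theme)
r_opposite = pearson_r([-x for x in rating], theme)
print(f"r = {r_a:+.2f}; r for the opposite rating = {r_opposite:+.2f}")
```

Under these assumptions roughly two-thirds of the simulated coefficients should fall within the limits of one standard error, although none of them reflects any real relationship.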

This brings us to the last argument advanced by Symonds. 
‘These consistencies and regularities . . . fit in too well with the
meaning of the data to attribute them entirely to chance.’ They
are ‘verified by their dynamic significance’. As an example
Symonds quotes the fact that ratings of ‘good adjustment’ correlate
positively with fantasies of anxiety. This is interpreted as meaning
that ‘the good adjustment was a reaction formation which kept the
individual from recognizing his anxiety and gave him a way of
defending himself against it’. Those with poor adjustment, it is
claimed, ‘did not show anxiety in their stories, because the anxiety 
expressed itself symptomatically in their poor adjustment to life’. 

Here we have the full flavour of the type of argument char- 
acterized as unscientific in our preceding discussion. A correlation 
is found between ‘good adjustment’ and ‘fantasies of anxiety’ 
which is only just larger than its own standard error. If this correla- 
tion had been in the opposite direction, it might have been possible 
to argue that it was in the expected direction, and that there is 
nothing inherently absurd about maladjusted children having 
fantasies of anxiety. The fact that the correlation is in the wrong 
direction, however, does not worry the ‘dynamic’ psychologist; 
indeed, it is expressly cited because it is ‘verified’ by its ‘dynamic 
significance’. All this is accomplished through that ready hand- 
maiden, ‘reaction formation’. The author is not concerned about 
the fact that in other projection tests, such as the Sentence Com- 
pletion Test, there is a significant and negative correlation between 
adjustment and fantasies of anxiety; his ‘explanation’ is a purely 
ad hoc one which can of course be relied upon to ‘explain’ the facts 
which it was constructed specifically to account for. Nor is the 
author worried by the fact that he is forced to make assumptions 
of such magnitude as would make less ‘dynamic’ writers wince— 
assumptions, for instance, that good adjustment is a defence 
against anxiety! When it is realized that all this elaborate appar- 
atus is set in motion, and all these tremendous theories and 
assumptions put forward, to explain a correlation which is com- 
pletely insignificant, it will be apparent why such procedures are 
considered antagonistic to the spirit of science, and unlikely to 
advance our knowledge of human behaviour. 

When we try to apply these precepts to the study of personality, 
we find that this discipline has charted for itself a dangerous 
course, threatened by the Scylla of nebulous Freudian general- 
izing, and by the Charybdis of pointlessly amassing unrelated 
‘facts’. Thus we have two great groups of workers whose efforts 
bring extraordinarily little return for the labour expended: on the
one hand the Neo-Freudian, Crypto-Jungian group of ‘intuition-
ists’, who scorn the pedestrian methods of scientific verification,
who base their conclusions on very small biased samples, and 
whose alleged ‘facts’ are themselves only interpretations according 
to very dubious canons of evidence; on the other, the mole-like 
calculators of innumerable correlations between measures taken 
without any theory or hypothesis capable of proof or disproof, who 
mistake the collection of a thousand unrelated ‘facts’ for science, 
who seek to atone for the scientific barrenness of their results by 
stressing the mathematical beauty and purity of the methods used, 
and who seem incapable of seeing the wood for the trees. In other 
words, there appears to be a definite schizophrenic interlude in the 
development of a scientific psychology of personality; the building 
up of hypotheses and theories is divorced from the patient collec- 
tion of facts. ‘Dynamic’ psychology is almost entirely devoted to 
an exploitation of ad hoc hypotheses which are either unverifiable 
or appeal to effects which neutralize each other; ‘actuarial’ psycho- 
logy is almost entirely devoted to statistical analysis without 
benefit of theory or hypothesis. It is small wonder that neither 
separately nor conjointly do these two ‘schools’ succeed in con- 
structing a firm, scientific foundation on which to build. What is 
needed in psychology, as in any other science, is greater under-
standing of, and more intensive use of, the hypothetico-deductive
method, in which a clear, unambiguous hypothesis is stated, 
deductions, preferably of a quantitative kind, are made, and 
experiments performed to verify or disprove the hypothesis. 

We may summarize our discussion so far by quoting Stebbing 
(1930): 

‘Scientific thinking is controlled and directed thinking; it is 
essentially methodical. Controlled thinking, in so far as it is 
successful, issues in the organization of facts originally appre- 
hended as fragmentary, disconnected, or it may be discordant. 
Organization is achieved by the discovery of connections whereby 
one fact is related to another. The isolated facts, given as discon- 
nected, are fitted together into an orderly arrangement which 
yields what relatively to the starting point may be regarded as 
a whole, or system. 

‘We may say that a set of elements exhibits order and there- 
fore becomes a system when, given the properties of some of the 
members of the set, the properties of other members of the set, or 
at least of some of them, are thereby determined. This determina-
tion is due to the relation that orders the set; it is not a property of
the elements regarded as a class. 

‘The relevance of the subject of order to systematic inquiry may 
be regarded as evident. Scientific thinking essentially consists in 
the organization, or co-ordination, of the facts with which it deals. 
A collection of disconnected facts does not constitute a science any 
more than a mob of men constitutes a college, or a heap of bricks 
a building. The sciences are concerned not with facts as such but 
with ordered facts. One main difference between the earlier and 
the later stages of science is to be found in the growing prominence 
of order. What is ultimately important for science is the type of 
order rather than the elements that are ordered. It is for this 
reason that the most highly developed science, theoretical physics, 
appears to be concerned with something far removed from what 
the plain man would regard as a fact. Science has its origin in the 
attempt to co-ordinate the facts of sensible experience but in the 
discovery of the appropriate type of order it passes to the considera- 
tion of a different order of facts. 

‘Ultimately, the explanation of facts is to be found in their 
organization into a system. Thus, a mere collection of observable 
facts does not suffice to constitute a science. There is required in 
addition a certain kind of attitude to the facts and a certain kind 
of predominantly logical method. The requirements of this method 
determine whether a given observable fact is a possible datum for 
science. The scientists, that is to say, select those facts that can be 
treated in accordance with this method. We have then, two char- 
acteristics that belong to science; the selection of a certain kind of 
facts and the use of a certain kind of method. The scientist is con- 
cerned with the correlations of sets of properties. Hence, a scientific 
proposition is ultimately of the form: Whatever has the property 
X has the property Y. A scientist reaches such a proposition from 
propositions of the form: This A, this B, this C, etc., has the pro- 
perties X and Y. That is to say, he generalizes from particular 
cases. The instrument by which the generalizations are obtained
is scientific method. The scientist, then, considers particular facts 
only in order to obtain generalizations of increasing abstractness. 
All observation involves abstraction, that is, selection from some- 
thing that is also there to be observed. 

‘The method of science sometimes called the inductive method 
has four essential stages. The first of these is constituted by the 
awareness of a familiar complex situation in which some one fact
is apprehended as peculiar, and where the need is felt to account
for this fact by connecting it with the total situation in which its
occurrence would not be unexpected.

‘The second stage is the formation of an hypothesis which 
would connect the unexpected fact with other facts. 

‘Third comes the deductive development of the hypothesis, and 
fourth comes the testing of the consequences deduced from the 
hypothesis by appealing to observable facts. 

‘With regard to such an investigation we may be said to ex- 
plain a fact when we have shown that it is connected in an orderly 
manner with other facts that do not have to be explained.’ 

This general point of view is opposed by those who hold what 
is often called the idiographic as contrasted with the nomothetic 
point of view. These terms, coined by Windelband (1921) and 
introduced into Anglo-Saxon psychology by Allport (1937), refer 
to attempts to ‘understand some particular event in nature or
society’ as opposed to the seeking of ‘general laws and employing
only those procedures admitted by the exact sciences’. The main 
battle-cry of the idiographically minded psychologist is that of 
‘uniqueness of the individual’. ‘One may say (with penetrating 
accuracy) that each personality is a law unto himself, meaning 
that each single life, if fully understood, would reveal its own
orderly and necessary process of growth. The course of each life 
is a lawful event, even though it is unlike all other of its class’ 
(Allport, 1937). 

The idiographic view has a certain appeal to most psycho- 
logists who have to deal with people because its main proposition 
is so obviously true. It is quite undeniably true that Professor 
Windelband is absolutely unique. So is my old shoe. Indeed, any 
existing object is unique in the sense that it is unlike any other 
object. This is true as much in the physical sciences as it is in the 
biological, sociological, and psychological sciences. So much, then, 
is agreed. Two points arise. 

The first point relates to the meaning of this uniqueness. To 
Allport, it appears to be some mystical quality, something sui 
generis, something ‘afar from the sphere of our sorrow’. To the 
scientist, the unique individual is simply the point of intersection of a 
number of quantitative variables. There are some 340,000 discrimin- 
able colour experiences, each of which is absolutely unique and 
distinguishable from any other. From the point of view of descrip-
tive science, they can all be considered as points of intersection of
three quantitative variables, hue, tint, and chroma. A combination
of perfectly general, descriptive variables is sufficient to allow any 
individual to be differentiated from any other, by specifying his 
position on each of these variables in a quantitative form. Many 
writers ‘seem unable to see that one individual can differ quantita- 
tively from another in many variables, common variables though 
they may be, and still have a unique personality’ (Guilford, 1936). 
Quite on the contrary, the very notion of ‘being different from’ 
implies at the same time the idea of direction and the idea of 
amount—in other words, the unique individuals cannot meaning- 
fully be said to be different from each other unless they are being 
compared along some quantitative variable. Uniqueness, there- 
fore, is not in any sense a concept antagonistic to science; it 
follows from the methods used in science to describe individual 
events in terms of common variables. 
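
The argument of this paragraph may be made concrete by a small Python sketch. It is offered as an illustration under stated assumptions, not as a model of colour vision: the choice of 70 discriminable steps on each variable is arbitrary (chosen only to yield a total of the same order as the figure quoted above), and the names `count_positions` and `difference` are hypothetical. It shows that a few common variables suffice to give every individual a unique position, and that a difference between two individuals consists of a direction and an amount on each variable.

```python
# Hypothetical illustration: three common variables, each divided into an
# arbitrary 70 discriminable steps (not a figure taken from colour research).
VARIABLES = ("hue", "tint", "chroma")
STEPS_PER_VARIABLE = 70


def count_positions(n_variables, steps):
    """Number of distinct points of intersection of the variables."""
    return steps ** n_variables


def difference(a, b):
    """Signed difference (direction and amount) on each common variable."""
    return {name: b[name] - a[name] for name in VARIABLES}


n_unique = count_positions(len(VARIABLES), STEPS_PER_VARIABLE)

# Two individual colours, each specified by its position on every variable.
colour_a = {"hue": 12, "tint": 40, "chroma": 7}
colour_b = {"hue": 12, "tint": 43, "chroma": 5}
diff = difference(colour_a, colour_b)

print(n_unique)  # 343000
print(diff)      # {'hue': 0, 'tint': 3, 'chroma': -2}
```

The two colours are unique, yet everything that distinguishes them is stated in terms of variables common to both.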

The second point relates to the importance attributed to the 
unique. ‘Science is not interested in the unique event; the unique 
belongs to history, not to science.’ Yet clearly applied science must 
deal with the unique, with the individual. We cannot build bridges 
in general, but only a particular bridge; we cannot cure patients 
as such, but only a particular patient. In dealing with the indi- 
vidual case, we must make use of such general laws as may be 
known to apply; our success in dealing with the individual case 
will be determined by how much is known by science regarding 
the laws governing this case. It is for this reason that bridges are 
built with greater success than has hitherto attended our efforts 
at curing patients by psychotherapy; it is not that patients are 
‘unique’ and ‘individual’, and bridges not—a bridge is as unique 
and individual as a patient—but because the laws governing the 
behaviour of bridges are very much better understood than the 
laws governing the behaviour of patients. Here indeed we have the 
best possible defence of current psychiatry against charges that its
procedures are unscientific: as long as psychology fails to discover 
general laws applicable to the psychiatrist’s practical problems, 
the psychiatrist must use whatever methods common sense, 
practical experience, intuition and theory dictate. He cannot tell 
his patients to wait until the glorious sun of science comes forth to 
illuminate the scene; he has to deal with them here and now. As 
long as the psychiatrist does not claim to have solved the scientific 
problems of personality, the psychologist certainly has no right to 
blame him for making use of any and every possible weapon he
can lay his hands on, or for making up his own conceptual
schemata. Nothing that could be called a science of personality 
exists at the moment, and psychiatrists can hardly be blamed for 
refusing to base their procedures on the mere hope of the future. 
Pragmatism is therefore currently the only method open to 
psychiatry and clinical psychology; it is perhaps unfortunate that 
so many practitioners, instead of keeping an open mind in the 
very undeveloped state of the subject at the moment, have instead 
fallen prey to a ‘premature crystallization of spurious orthodoxies’.

Another shibboleth, second only to that of ‘uniqueness’, is that 
of the ‘total personality’. In a sense, again, what those who use this 
phrase mean is so obviously true that no one would dissent from 
their stress on studying personality in toto, as it were, and not hack 
it up into arbitrary, unconnected parts. It would not be true, 
however, to imagine that science holds a point of view opposed 
to such insistence. Even in physics, relativity theory lays great 
emphasis precisely on this ‘interconnectedness’ of all the phenom- 
ena which go to make up its field of study, and in psychology 
attacks on ‘atomists’ are largely attacks on men of straw, not on 
points of view held by any responsible psychologist. It is in the 
negative connotations of this phrase, rather than in the positive 
ones, that one may discern certain specious arguments. Thus 
believers in the ‘wholeness’ of personality often decry any type of 
analysis, and declare that analysis destroys what it seeks to study. 
Instead, they advocate methods and techniques which are said to 
‘study personality as a whole’. 

It is difficult to attach any more meaning to this phrase than 
to a claim to study ‘the universe as a whole’. Nor does an investiga- 
tion of the methods actually used by those who employ these 
phrases show anything very different from the ordinary procedures 
used for many years—unless it be a lack of reliability, disregard of 
validity, and a contempt for the patient work needed for verifying 
theories and hypotheses. When we look into a manual of the 
Rorschach Test (Klopfer & Kelly, 1942), for instance, the pre- 
ferred technique of the holistically-minded, we find there the claim 
that from the Rorschach record can be deduced: ‘(1) The degree 
and mode of control with which the subject tries to regulate his 
experiences and actions. (2) The responsiveness of his emotional 
energies to stimulation from outside and promptings from within. 
(3) His mental approach to given problems and situations. (4) 
His creative or imaginative capacities, and the use he makes of
them. (5) A general estimate of his intellectual level and the major
qualitative features of his thinking. (6) A general estimate of the
degree of security or anxiety, of balance in general, and specific
unbalances. (7) The relative degree of maturity in the total per-
sonality development.’ The author is careful to point out that this
is not all that the Rorschach can tell us, and goes on to say that
‘this list does not represent a complete account of the personality
aspects revealed by the Rorschach method’. Apparently the test is
also ‘remarkably effective in estimating the intellectual status of 
an individual; in revealing the richness or poverty of his psychic 
experience; in making known his present mood; and in showing 
the extent of his intuitive ability as well as in disclosing special 
talents and aptitudes’. Furthermore, ‘it detects anxieties, phobias, 
and sex disturbances, as well as more severe disorders, and serves 
as a guide for appropriate treatment’. 

According to this widely-accepted text, then, we are not deal- 
ing with the whole personality at all; we are dealing with creative 
and imaginative capacities, with intellectual level and status, with 
security, anxiety, balance and unbalance, with moods and 
intuitive abilities, with phobias, sex disorders, and other disturb- 
ances. It is not easy to see, then, how this treatment differs from 
the traditional analysis of personality into cognitive, conative, and 
affective areas, into abilities, temperament, character, intelligence, 
traits, and so forth. Obviously the total personality has to be 
analysed into smaller units before we can study it at all; we are 
thrown back to the question of what is the best method of analysis, 
and how we can determine its validity. There is no book on per- 
sonality, and no test or technique, which deals with the unique, 
total personality; these words are merely used as camouflage to 
screen the absence of acceptable validity and reliability. 

What has been said here of the Rorschach test applies with 
equal force to the psychoanalytic technique. Here also we do not 
encounter a unique, total personality; we find Oedipus complexes 
and super egos, regressions and transferences, cathexes and 
libidos, fixations, symbolisms, compensations, catharses, Narcis-
sisms, and many other strange entities jostling each other; all
of them attempts to analyse what is elsewhere declared to be
‘unanalysable’.

Surely this whole argument rests on a misunderstanding. What 
we are dealing with, in psychology as well as in physics, are certain 
entities, parts, ‘sub-wholes’, or whatever they may be called, which
stand in certain relations to each other. It is vitally important to
identify the correct parts or sub-wholes; it is equally vital to dis- 
cover the relations obtaining between them. It is absurd to call 
one of these tasks more important than the other; they are 
mutually interdependent, and progress in science is impossible 
without paying due regard to both. To say that orthodox psycho- 
logy is ‘atomistic’, and is interested only in the parts, to the neglect 
of the relations between them, is manifestly untrue; techniques of 
curvilinear regression, of curve-fitting, and of trend analysis show 
clearly the concern statistical psychologists feel about the exact 
form of the relation obtaining between the various entities dealt 
with. What is important is not whether one party or the other to 
this dispute is right; the main consideration must be that each side 
should put its point of view in such a way that an empirical test, 
a crucial experiment, becomes possible. The question of the rela- 
tion between the variables found useful in psychology must be an 
experimental problem; it cannot be solved on the philosophical 
level. 

Sometimes this argument is put in a rather different form. It 
is said that the human mind is able to seize upon, and to co- 
ordinate, large numbers of small impressions (‘petites perceptions’, 
if the historical parallel be permitted), while the calculating 
machine cannot as fruitfully deal with such large numbers of 
determinants, all of them influencing each other in complex ways. 
Such an argument in favour of interviewing, of intuitive inter- 
pretation of projective productions, and similar subjective pro- 
cedures, is not on the face of it unreasonable, although it may not 
appeal to the tough-minded scientist. It would appear to be a 
perfectly genuine hypothesis with much prima facie validity; un-
fortunately this common-sense appeal of the hypothesis has
blinded many practitioners to the fact that there is little evidence 
in favour, and much against, this belief in the integrating power 
of the mind. I shall leave out of account the mathematical proof 
that given a number of facts (however large) and their inter-
connections and relations with a criterion (however complex), it
can be shown that there is one best combination of these facts 
which gives the highest possible predictive accuracy, and that this 
best combination can be reached by orthodox statistical methods 
of multiple correlation; the intuitive brain being at best able to 
equal, but not to excel this prediction. I shall instead mention 
briefly three experimental studies which may cause the believer in
subjective methods at least to suspend judgment until he can quote
equally impressive evidence in his favour. 
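
The mathematical point set aside above can be illustrated, for the restricted case of linear combinations of two predictors, by the following Python sketch. It is not a proof, and it is not taken from the studies cited below: the data are artificial, the names `test_score`, `interview` and `criterion` and the generating weights 0.6 and 0.2 are hypothetical, and the property shown holds within the sample on which the weights were fitted. The least-squares weights give the multiple correlation; no other weighting of the same two predictors, however chosen, exceeds it.

```python
import math
import random


def centre(values):
    """Express values as deviations from their mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def corr(xs, ys):
    """Product-moment correlation between two equally long lists."""
    a, b = centre(xs), centre(ys)
    sab = sum(p * q for p, q in zip(a, b))
    return sab / math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))


def least_squares_weights(x1, x2, y):
    """Solve the two normal equations for the weights of x1 and x2."""
    a, b, c = centre(x1), centre(x2), centre(y)
    s11 = sum(p * p for p in a)
    s22 = sum(q * q for q in b)
    s12 = sum(p * q for p, q in zip(a, b))
    s1y = sum(p * r for p, r in zip(a, c))
    s2y = sum(q * r for q, r in zip(b, c))
    det = s11 * s22 - s12 * s12
    return (s1y * s22 - s2y * s12) / det, (s2y * s11 - s1y * s12) / det


def combine(w1, w2, x1, x2):
    """Weighted sum of the two predictors for every case."""
    return [w1 * p + w2 * q for p, q in zip(x1, x2)]


# Artificial data: a criterion depending on two predictors plus error.
rng = random.Random(0)
n = 200
test_score = [rng.gauss(0, 1) for _ in range(n)]
interview = [rng.gauss(0, 1) for _ in range(n)]
criterion = [0.6 * t + 0.2 * i + rng.gauss(0, 1)
             for t, i in zip(test_score, interview)]

w1, w2 = least_squares_weights(test_score, interview, criterion)
multiple_r = corr(combine(w1, w2, test_score, interview), criterion)

# Five hundred arbitrary ("intuitive") weightings of the same predictors.
other_rs = []
for _ in range(500):
    v1, v2 = rng.gauss(0, 1), rng.gauss(0, 1)
    other_rs.append(corr(combine(v1, v2, test_score, interview), criterion))
best_other = max(other_rs)

print(f"multiple R = {multiple_r:.3f}; best arbitrary weighting = {best_other:.3f}")
```

The sketch says nothing about how an interviewer actually weights his impressions; it shows only that, among linear combinations of given facts, the statistical weighting sets the upper limit which any other weighting can at best equal.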

First we may quote the conclusion arrived at by the Bureau 
of Naval Personnel, in their large-scale studies of prediction of 
success at training schools (Stuit, 1947). ‘The improvement in
predicting school success by having in addition to test scores an 
interviewer’s evaluation of experience, interest, and personality, 
is relatively small and may well be negative.’ The actual figures 
quoted show that in many cases the addition of the interview to 
the straightforward prediction based on test scores alone lowers 
predictive accuracy; in no case is there a striking improvement. 
In view of the very large numbers involved, 37,862 trainees having 
been studied altogether, this conclusion and the data on which it is 
based deserve most careful study. 

Another study giving similar results, although in a different 
sphere, is the experiment on student selection conducted by 
Himmelweit (1950, 1951). One of her aims was the comparison 
of existing methods of selection with objective test techniques. (A 
more detailed report of her results with objective tests of person- 
ality will be found in a later chapter.) One of the traditional 
techniques much valued by the University was the interview. ‘The
main object of the interview is to assess the candidate’s suitability
to pursue a course of study, special consideration being given to the
following factors: (a) General intelligence. (b) Previous education,
training and experience. (c) Interests and motivation. (d) Person-
ality and character.’ None of the tests of intelligence and tempera-
ment which proved to be successful predictors of student success
correlated significantly with the interview; indeed, there was a 
tendency for low negative correlation coefficients to predominate. 
When correlated with success at University the interview again 
failed to show a significant correlation; its predictive accuracy was 
indicated by a correlation coefficient of .067! In view of the rela-
tively high correlations achieved by objective tests, we find again
that belief in the integrating faculties of the human mind is not 
supported by experimental findings. 

Probably the most impressive, as well as the most conclusive, 
study, however, is the experiment carried out by Kelly (1949) on 
the selection of clinical psychology students. Using a plethora of 
different methods—interviews, ratings, objective tests, projective 
tests, intelligence tests, sociometric techniques, interest and atti-
tude schedules and questionnaires, and following the students up
over several years, he concluded that ‘the most valid individual
clinical judgments tend to be those based on relatively incomplete 
data regarding the assessee. In general, clinicians seem to make 
their best predictions on the basis of materials contained in the 
credential file and an objective test profile. Subsequent predictions 
based on increasing amounts of data, e.g. the addition of the auto- 
biography, the projective tests, the interview, and the situation 
tests seem to have resulted in successive decrements in validity, as 
each additional type of data was added to the picture. This finding 
appears all the more significant when it is remembered that 
assessment staffs have uniformly been of the opinion that the inter- 
view contributed relatively most to their understanding of the case.’ 

While the report from which this summary is quoted is only a 
preliminary one, its conclusion shows a striking agreement with 
one of the most interesting results of the O.S.S. assessment pro- 
gramme (1948). Using in the main two selection stations for 
recruits, S and W, of which the former devoted roughly three 
times as much time to each recruit as did the latter, with all the 
increase in information, testing, interviewing, and conferences that 
that implies, the authors of the report found that ‘the job ratings 
given after the one-day assessment at Station W were generally 
more valid than those given after the three-day assessment at 
Station S’. The respective figures, giving correlations between 
rating and criterion, are .37 and .53, using the criterion which the 
writers themselves regarded as the most valid. The explanation 
that perhaps the candidates at these two stations differed in such 
a way as to account for the result will not hold; ‘it is not possible 
to say that the candidates who went to W were any easier to 
assess’. The writers conclude: ‘It would be profitable, in the long 
run, for us to assume that the additional information obtained 
by stretching the screening process from one to three days had 
diminished the validity of the final decision.’ 

The evidence from all these different writers, with their vary- 
ing biases and divergent outlooks, appears to agree that the human 
brain is not a very good instrument for assessing, weighting, and 
combining many items of information in such a way as to make 
valid predictions; on the contrary, it is easily confused when pre- 
sented with more than a very few such items, and added informa- 
tion tends to lower, rather than to raise, its forecasting efficiency. 
The evidence is perhaps not conclusive, but it is remarkably 
unanimous. 
The history of science thus gives us guidance as to the method- 
ology to be employed in scientific work; does it also help us in 
deciding which of the many pressing problems to attack first? 
There appears to be one type of problem which is fundamental to 
all progress in the study of personality, and which must find at 
least a preliminary answer before we can hope to advance in any 
direction at all. This problem is that of taxonomy or classification, 
and it can best be introduced by using a simple example of how 
other problems tend to lead back to it. 

It is believed nearly unanimously by psychiatrists, psycho- 
analysts, clinical psychologists, counsellors, and the lay public 
that psychotherapy has the effect of alleviating, partly or wholly, 
the illnesses of the neurotic. A look at the evidence for this belief 
will show that it is a common-sense belief, not one which could be 
considered scientific. This does not mean that the belief is wrong; 
it merely means that no scientific evidence has been produced to 
justify it. All we have are statements of subjective feelings by 
psychotherapists regarding the outcome of their labours; it must 
be obvious that these statements cannot be taken as scientific 
evidence, in view of the long list of mistaken beliefs of this kind 
which history gives us. 

It may be of interest to look at such evidence as there is, and to 
discuss possible methods of investigating the truth of the general 
belief. In the only previous attempt to carry out such an evalua- 
tion that has come to hand, Landis (1938) has pointed out that 
‘before any sort of measurement can be made, it is necessary to 
establish a base line and a common unit of measure. The only 
unit of measure available is the report made by the physician 
stating that the patient has recovered, is much improved, is 
improved or unimproved. This unit is probably as satisfactory as 
any type of human subjective judgment, partaking of both the 
good and bad points of such judgments.’ For a base line Landis 
suggests ‘that of expressing therapeutic results in terms of the 
number of patients recovered or improved per 100 cases admitted 
to the hospital’. As an alternative, he suggests ‘the statement of 
therapeutic outcome for some given group of patients during some 
stated interval of time’. 

Landis realized quite clearly that in order to evaluate the 
effectiveness of any form of therapy, data from a control group of 
non-treated patients would be required in order to compare the 
effects of therapy with the spontaneous remission rate. In the 
absence of anything better, he used the amelioration rate in state 
mental hospitals for patients diagnosed under the heading of 
‘neuroses’. As he points out, ‘there are several objections to the use 
of the consolidated amelioration rate... of the... state hos- 
pitals . . . as a base rate for spontaneous recovery. The fact that 
psychoneurotic cases are not usually committed to state hospitals 
unless in a very bad condition; the relatively small number of 
voluntary patients in the group; the fact that such patients do get 
some degree of psychotherapy especially in the reception hospitals; 
and the probably quite different economic, educational, and social 
status of the State Hospital group compared to the patients re- 
ported from each of the other hospitals—all argue against the 
acceptance of (this) figure . . . as a truly satisfactory base line, but 
in the absence of any other better figure this must serve.’ 
Actually the various figures quoted by Landis agree very well. 
The percentage of neurotic patients discharged annually as re- 
covered or improved from New York state hospitals is 70 (figure 
for the years 1925-34); for the United States as a whole it is 68 
(figure for the years 1926-33). The percentage of neurotics dis- 
charged as recovered or improved within one year of admission is 
66 for the United States (1933) and 68 for New York (1914). The 
consolidated amelioration rate of New York state hospitals, 1917- 
34, is 72 per cent; as this is the figure chosen by Landis we may 
accept it in preference to the other, very similar ones, quoted. By 
and large, we may thus say that of severe neurotics receiving in 
the main custodial care, and very little if any psychotherapy, over 
two-thirds recovered or improved to a considerable extent. ‘Al- 
though this is not, strictly speaking, a basic figure for “spon- 
taneous” recovery, still any therapeutic method must show an 
appreciably greater size than this to be seriously considered.’ 
Another estimate of the required ‘base line’ is provided by 
Denker (1946). ‘Five hundred consecutive disability claims due to 
psychoneurosis, treated by general practitioners throughout the 
country, and not by accredited specialists or sanatoria, were 
reviewed. All types of neurosis were included, and no attempt 
made to differentiate the neurasthenic, anxiety, compulsive, 
hysteric, or other states, but the greatest care was taken to 
eliminate the true psychotic or organic lesions which in the early 
state of illness so often simulate neurosis. These cases were taken 
consecutively from the files of the Equitable Life Assurance 
Society of the United States, were from all parts of the country, 
and all had been ill of a neurosis for at least three months before 
claims were submitted. They, therefore, could be fairly called 
“severe”, since they had been totally disabled for at least three 
months’ period, and rendered unable to carry on with any 
“occupation for remuneration or profit” for at least that time.’ 
These patients were regularly seen and treated by their own 
physicians with sedatives, tonics, suggestion, and reassurance, but 
in no case was any attempt made at anything but this most super- 
ficial type of ‘psychotherapy’, which has always been the stock in 
trade of the general practitioner. Repeated statements, every three 
months or so by their physicians, as well as independent investiga- 
tions by the insurance company, confirmed the fact that these 
people actually were not engaged in productive work during the 
period of their illness. During their disablement, these cases re- 
ceived disability benefits; as Denker points out, ‘it is appreciated 
that this fact of disability income may have actually prolonged 
the total period of disability and acted as a barrier to incentive for 
recovery. One would, therefore, not expect the therapeutic results 
in such a group of cases to be as favourable as in other groups 
where the economic factor might act as an important spur in 
helping the sick patient adjust to his neurotic conflict and illness.’ 

The cases were all followed up for at least a five-year period, 
and often as long as ten years after the period of disability had 
begun. The criterion of ‘recovery’ used by Denker is characterized 
by the following data: (1) Return to work, and ability to carry on 
well in economic adjustments for at least a five-year period. 
(2) Complaint of no further or very slight difficulties. (3) Making 
of successful social adjustments. Using these criteria, which are 
very similar to those usually used by psychiatrists, Denker found 
that 45 per cent of the patients recovered after 1 year, another 
27 per cent after two years, making 72 per cent in all. Another 10, 
5, and 4 per cent recovered during the third, fourth, and fifth 
years respectively, making a total of 90 per cent recoveries after 
five years.¹ 
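
The running total behind these figures can be checked with a few
lines of Python. This is only an arithmetic sketch using the yearly
percentages quoted above; the variable names are illustrative and
are not taken from Denker's report. The yearly figures as printed
add up to 91, one point more than the stated total of 90, which is
presumably a rounding effect in the published percentages.

```python
from itertools import accumulate

# Percentage of patients recovering in each successive year,
# as quoted in the text (years 1 to 5).
yearly_recoveries = [45, 27, 10, 5, 4]

# Running total of recoveries at the end of each year.
cumulative = list(accumulate(yearly_recoveries))

for year, total in enumerate(cumulative, start=1):
    print(f"after year {year}: {total} per cent recovered")
```

The second entry of the running total is the two-year figure of 72
per cent.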

This sample contrasts in many ways with that used by Landis. 
The cases on which Denker reports were probably not as severe 
as those summarized by Landis; they were all voluntary, non- 
hospitalized patients, and from a much higher socio-economic 
stratum. (The majority of Denker’s patients were clerical workers, 
executives, teachers, and professional men.) In spite of these differ- 
ences, the recovery figures for the two samples are almost identical. 
The most suitable figure to choose from those given by Denker is 
probably that for the two-year recovery rate, as follow-up studies 
seldom go beyond two years, and the higher figures for three-, 
four-, and five-year follow-ups would overestimate the efficiency 
of this ‘base line’ procedure. Using therefore the two-year recovery 
figure of 72 per cent, we find that Denker’s figure agrees exactly 
with that given by Landis. We may therefore conclude with some 
confidence that our estimate of some two-thirds of severe neurotics 
showing recovery or considerable improvement without the bene- 
fit of systematic psychotherapy properly so-called is not likely to be 
very far out. 

1 These percentages are plotted in Figure 1 to illustrate the process of 
recovery graphically. 

We may now turn to the effects of psychotherapeutic treatment. 
The results of nineteen studies reported in the literature, covering 
almost eight thousand cases, and dealing with both psycho- 
analytic and eclectic types of treatment, are quoted in detail on 
page 30. An attempt has been made to report results under the 
four headings: (1) Cured, or much improved. (2) Improved. 
(3) Slightly improved. (4) Not improved, died, discontinued 
treatment, etc. It was usually easy to reduce additional categories 
given by some writers to these basic four; some writers only give 
two or three categories, and in those cases it was of course 
impossible to subdivide further, and the figures for combined cate- 
gories are given. A slight degree of subjectivity inevitably enters 
into this procedure, but it is doubtful if this has caused much 
distortion. A somewhat greater degree of subjectivity is probably 
implied in the writer’s judgment as to which disorders and 
diagnoses should be considered to fall under the heading of 
‘neuroses’. Schizophrenic, manic-depressive, and paranoid states 
have been excluded; organ neuroses, psychopathic states, and 
character disturbances have been included. The number of cases 
where there was genuine doubt is probably too small to make 
much change in the final figures, regardless of how they are 
allocated. 

Certain difficulties have arisen from the inability of some writers 
to make their column figures agree with their totals, or to calculate 
percentages accurately. Again, I have exercised my judgment as 
to which figures to accept. In certain cases, writers have given 
figures of cases where there was a recurrence of the disorder after 
apparent cure or improvement, without indicating how many 
patients were affected in these two groups respectively. I have sub- 
tracted all recurrencies of this kind from the ‘cured’ and ‘improved’ 
totals, taking half from each. The total number of cases involved in 
all these adjustments is quite small. Another investigator making 
all decisions exactly in the opposite direction to mine would hardly 
alter the final percentage figures by more than one or two per cent. 

We may now turn to the figures as presented. Patients treated 
by means of psychoanalysis improve to the extent of 44 per cent; 
patients treated eclectically improve to the extent of 64 per cent; 
patients treated only custodially or by G.P.s, it will be remembered, 
improve to the extent of 72 per cent. There thus appears to be 
an inverse correlation between recovery and psychotherapy; the 
more psychotherapy, the smaller the recovery rate! This conclu- 
sion requires certain qualifications. 

1 A number of studies have been excluded because of such factors as exces- 
sive inadequacy of follow-up, part duplication of cases with others included in 
our table, failure to indicate type of treatment used, and other reasons which 
made the results useless from our point of view. Papers thus rejected are those by 
Thorley & Craske (1950), Bennett & Semrad (1936), H. I. Harris (1939), 
Hardcastle (1934), A. Harris (1938), Jacobson & Wright (1942), Friess & 
Nelson (1942), Comroe (1936), Wenger (1934), Eitington (1922), Orbison 
(1925), Coon & Raymond (1940), Denker (1937), and Bond & Braceland 
(1937). Their inclusion would not have altered our conclusions to any consider- 
able degree, although as Miles et al. (1951) point out: ‘When the various studies 
are compared in terms of thoroughness, careful planning, strictness of criteria 
and objectivity, there is often an inverse correlation between these factors and 
the percentage of successful results reported.’ 

In our tabulation of psychoanalytic results, we have classed 
those who stopped treatment together with those not improved. 
This appears to be reasonable; a patient who fails to finish his 
treatment, and is not improved, is surely a therapeutic failure. The 
same rule has been followed with the data summarized under 
‘eclectic’ treatment, except when the patient who did not finish 
treatment was definitely classified as ‘improved’ by the therapist. 
However, in view of the peculiarities of Freudian procedures 
it may appear to some readers to be more just to class these 
cases separately, and deal only with the percentage of completed 
treatments which are successful. Approximately one-third of the 
psychoanalytic patients listed broke off treatment, so that the per- 
centage of successful treatments of patients who finished their 
course must be put at approximately 66 per cent. It would appear, 
then, that when we discount the risk the patient runs of stopping 
treatment altogether, his chances of improvement under psycho- 
analysis are almost equal to his chances of improvement under 
eclectic treatment, and slightly worse than his chances under G.P. 
or custodial treatment. 
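
The adjustment for broken-off treatments can be sketched as simple
arithmetic, assuming (as in the tabulation described above) that
every patient who stopped treatment was counted among the failures.
The function name and its arguments are illustrative only.

```python
def completed_treatment_rate(overall_rate, dropout_fraction):
    """Success rate among patients who finished treatment.

    Assumes every patient who broke off treatment was counted as a
    failure in `overall_rate`, so all successes come from completers.
    """
    if not 0 <= dropout_fraction < 1:
        raise ValueError("dropout_fraction must lie in [0, 1)")
    return overall_rate / (1 - dropout_fraction)


# Figures quoted in the text: 44 per cent improved overall,
# approximately one-third of patients broke off treatment.
rate = completed_treatment_rate(44.0, 1 / 3)
print(round(rate))  # 66
```
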
In general, certain conclusions are possible from these data. 
They fail to prove that psychotherapy, Freudian or otherwise, 
facilitates the recovery of neurotic patients. They show that roughly 
two-thirds of a group of neurotic patients will recover or improve 
to a marked extent within about two years of the onset of their 
illness, whether they are treated by means of psychotherapy or not. 
This figure appears to be remarkably stable from one investiga- 
tion to another, regardless of type of patient treated, standard of 
recovery employed, or method of therapy used. From the point of 
view of the neurotic, these figures are encouraging; from the point 
of view of the psychotherapist, they can hardly be called very 
favourable to his claims. 

In saying this, I do not mean to imply that the figures quoted 
necessarily disprove the possibility of therapeutic effectiveness. 
There are obvious shortcomings in any actuarial comparison, and 
these shortcomings are particularly serious when there is so little 
agreement among psychiatrists relating even to the most funda- 
mental concepts and definitions. Definite proof would require 
a special investigation, carefully planned and methodologically 
more adequate than these ad hoc comparisons. But even the much 
more modest conclusion that the figures fail to show any favourable 
effects of psychotherapy should give pause. We are left in the 
position where any belief in psychotherapy depends on faith, not 
on scientifically demonstrated fact. This faith may be justified or 
not; until its truth is demonstrated, however, it clearly cannot 
form part of science. That such a demonstration would be difficult 
cannot be denied; that it is essential for the future progress of 
therapy will be obvious. 

It may be asked what would be required in order to achieve a 
satisfactory proof. The basic requirements would appear to be: 
(1) A valid and reliable method of assigning patients to classes 
(neurotic, psychotic, organic, etc.) and subclasses (hysteric, anxiety 
state, depressive, schizophrenic, etc.). This assignment would of 
course have to be based on a satisfactory taxonomic or nosological 
system of classification. (2) A valid and reliable method of assessing 
degrees of disorder, so that a lesser degree of disorder could be 
correlated with the course of psychotherapy. (3) A control group 
of patients not treated at all during the period of the experiment, 
so that a base line could be obtained against which to compare the 
effects of therapy. Given these requirements, the problem appears 
a simple and straightforward one; if any one of the requirements is 
not forthcoming, the problem is insoluble at the scientific level. 
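
As a rough sketch of what requirement (3) would make possible, the
recovery rate of a treated group could be set against that of an
untreated control group with an ordinary two-proportion z statistic.
The counts below are hypothetical, invented purely for illustration,
and the function name is an assumption of this example rather than
anything taken from the studies cited.

```python
import math


def two_proportion_z(improved_treated, n_treated, improved_control, n_control):
    """z statistic for the difference between two recovery proportions."""
    p_treated = improved_treated / n_treated
    p_control = improved_control / n_control
    # Pooled proportion under the hypothesis of no treatment effect.
    pooled = (improved_treated + improved_control) / (n_treated + n_control)
    standard_error = math.sqrt(
        pooled * (1 - pooled) * (1 / n_treated + 1 / n_control)
    )
    if standard_error == 0:
        raise ValueError("proportions are degenerate; z is undefined")
    return (p_treated - p_control) / standard_error


# Hypothetical counts: 132 of 200 treated and 144 of 200 untreated
# patients classed as recovered or improved.
z = two_proportion_z(132, 200, 144, 200)
print(f"z = {z:.2f}")
```

A positive z would favour the treated group and a negative z the
control group; the comparison means nothing unless requirements (1)
and (2) are also met, since otherwise the two groups and the
criterion of improvement are not comparable.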

The third condition is difficult to fulfil because of social and 
ethical objections; there is no systematic difficulty there that could 
not be overcome. It is conditions 1 and 2 which raise insuperable 
difficulties at present. Psychiatry has elaborated a system of classi- 
fication (or indeed a series of such systems) which appears to have 
little support in empirical studies, but is almost entirely based 
on clinical insight and unsupported speculation. In many cases, 
different systems contradict each other; thus the Freudians assume 
that normality, neurosis, and psychosis form one continuum of 
regression, while orthodox psychiatry appears to support the view 
that neurotic and psychotic disorders are as it were qualitatively 
different diseases which are independent of each other. Several 
other views are held by various groups, and there are many facts 
in contradiction to any hypothesis so far advanced. This problem 
is recognized by most psychiatrists, who put very little faith in 
diagnostic systems, and is often shrugged off as being of little import- 
ance. Yet unless it is solved, it is difficult to see how such questions 
as the one regarding the effects of psychotherapy can be answered. 
The importance of taxonomy is often underrated by those inter- 
ested in the dynamics (so-called) of human behaviour, yet without 
an adequate taxonomy progress is difficult if not impossible in 
these more nebulous and even less easily accessible regions. 

Even if there were agreement on the main psychiatric group- 
ings, yet the reliability of assigning patients to these classes would 
not be regarded as sufficient by most psychologists. The work of 
Ash (1949), Doering (1934), Elkin (1947), and many others has 
shown how much disagreement there is among psychiatrists even 
when questions of major classification (neurotic, psychotic, normal, 
mental defective) are at issue; anyone familiar with case confer- 
ences will know the frequency of almost total disagreement with 
respect to major classification between equally competent psychia- 
trists. Of particular interest in this connection are some figures 
reported by the Research Branch of the Information and Educa- 
tion Division in the U.S. War Department, showing the almost 
incredible differences in diagnosis between psychiatrists working 
in different establishments (Stouffer, 1950). (A detailed discus- 
sion of these figures is given in Chapter III.) 

If we may then regard it as agreed that psychiatric diagnosis 
is of doubtful validity and low reliability, we may turn with interest 
to the clinical psychologist, in the hope that he may have used his 
scientific training in a fruitful attempt to arrive at a useful, reliable, 
and valid taxonomy. Perusal of research contributions will soon 
dispel such optimistic thoughts. Far from improving on the some- 
what arbitrary criteria used by the psychiatrist, the psychologist 
has by and large succeeded merely in elaborating devices which 
will give rather low correlations with these psychiatric classifica- 
tions; in other words, he is content to take over the conceptualiza- 
tions of the psychiatrist, and tag behind in a fruitless endeavour to 
do rather less well what the psychiatrist could do without his aid! 
If this be indeed the function of psychology in its diagnostic aspect, 
then it is small wonder that so many psychologists have eschewed 
the Kraepelinian straight-jacket in order to indulge in Freudian 
manic spells which give them at least the illusion of usefulness. 
This example is intended to illustrate a very important general- 
ization, namely, that taxonomy, nosology, or classification lies 
at the very root of scientific progress, and that until taxonomic 
problems are solved in at least a preliminary way, scientific pro- 
gress towards answering more complex problems is barred. The 
history of science illustrates this again and again. Without the 
work of Ray and Linnaeus biology could not have advanced as it 
did; Mendeleeff and his periodic table of the elements prepared 
the way for the fundamental advances in physics which culminated 
in the splitting of the atom. The importance of taxonomic concepts 
in the physical sciences is often neglected because they sometimes 
seem self-evident, and because their discovery frequently precedes 
recorded history; this hardly affects the argument, however. 


Measurement is essential to science, but before we can measure we 
must know what it is [...] discovery must pre[...] 

Unfortunately, [...] development; it ha[...] desire to be ‘dynamic’ [...] 
study movement, one [...] it is that moves. Wher[...] invented ad hoc 
to sui[...]vation at the dictate [...] of fact to fiction has [...] 
anatomy, for instance, taught that the body contained ten systems 
of ducts bearing the ten fluid products of the body to all portions 
of it. Ducts and fluid products were actually observed; ten was 
considered to be the perfect number. Anatomists consequently 
sought for, and believed they had found, structures which agreed 
with the philosophers’ notions of perfection. This type of taxonomy 
can hardly be considered scientifically useful. 

The rest of this book, then, is concerned with the problem of 
taxonomy in psychology; the question of classifying and isolating 
traits and types, and of measuring them objectively. In particular, 
our interest has centred around the following questions. (1) It is 
customary among psychologists and psychiatrists to study neurotics 
and psychotics because of the light these abnormal states are 
believed to throw on the functioning of normal personality. The 
assumption is made that in mental disorder we observe ordinary 
human behaviour ‘writ large’, as it were; the disease process is 
believed to serve as a kind of magnifying glass or microscope which 
enlarges what is normally invisible to the naked eye. This assump- 
tion may be well founded, but it has certain implications which are 
closely related to problems of taxonomy. If the main types of 
neurosis (hysteria and psychasthenia) are the prototypes of a classi- 
fication of normal human beings into extraverts and introverts, as 
Jung maintains; or if the main psychotic division into schizo- 
phrenics and manic depressives should be used for grouping normal 
human beings into types, as Kretschmer believes—then it must 
follow that neurosis and psychosis are not disorders qualitatively 
different from the ordinary run of personality differences found 
in the general population, but simply extremes on a continuum 
which includes everyone. This assumption is made explicitly by 
Kretschmer and at least implicitly by Jung; many psychiatrists 
would hold an opposite view at least with respect to the psychotic 
states. Geneticists like Kallman, who believe they have isolated the 
genes responsible for the transmission of psychotic insanity, must 
inevitably hold the qualitative view, of course, and thus be added 
to the majority of orthodox psychiatrists. This question of quantita- 
tive continuum or qualitative difference is too important to be 
decided by subjective impression or majority vote, however, and 
experiment must decide which of these conflicting views presents 
us with a more accurate model of reality. It will be shown later 
that our experiments strongly support the quantitative continuum 
hypothesis, both in the case of neurotic and of psychotic disorders. 

(2) If we have, then, a continuum from normal to neurotic, 
and also from normal to psychotic, the question arises whether we 
are dealing with two continua, as many psychiatrists would have 
us believe, or only with one continuum of ‘psycho-sexual regres- 
sion’, as Freud and his followers assert. As is well known, to Freud 
the neurotic has partly regressed from the adult psychosexual adjust- 
ment, while the psychotic has regressed to a considerably greater 
extent; thus we would on his hypothesis be dealing with a con- 
tinuum going from the normal, well adjusted person at the one end, 
through the partly regressed, poorly adjusted neurotic, to the com- 
pletely regressed psychotic at the other end. In terms of dimensional 
analysis, the question may be reworded in this form:—If we take 
three groups (normals, neurotics, psychotics), can these be repre- 
sented as occupying one dimension only, or are two dimensions 
required? It will be shown later that the Freudian theory has to be 
rejected, and that two dimensions are required to represent these 
three groups adequately. 

(3) We have so far discussed ‘neurotics’ and ‘psychotics’ as if 
these were homogeneous groups. Yet it is well known that sub- 
groups can be distinguished within the major groupings—hysterics, 
anxiety states, obsessionals, neurasthenics, and so forth within the 
neurotic complex, paranoids [...] involutionals and so forth wit[...] 
respect to these sub-groupings that typologies such as the Jungian 
[...] deal with the questi[...] problems of allocation of such groups 
as psyc[...] called projective techniques, and have con[...]n much 
neglected lately [...] namely those types of test which I have called 
‘objective behaviour tests’ (1950a). 

It will be agreed by most psychologists that the subject-matter 
of a science of personality must be human behaviour. We may 
require intervening variables, hypothetical constructs, and theor- 
etical concepts such as instincts, drives, reflexes, complexes, atti- 
tudes, or fixations in order to bring some kind of order into the 
confused mass of facts which observation offers us, but nevertheless 
our starting point must be actually observed behaviour—provided 
we are careful to include all aspects of verbal, autonomic, and 
involuntary behaviour. Most psychologists would also agree, 
perhaps, with Wolfle's (1949) ‘fundamental principle of personality 
measurement’: ‘An individual reveals his own personality through 
any change he makes upon any type of material.’ Rephrased, this 
might read: An individual reveals his own personality through any 
observable item of behaviour. This would equate ‘personality’ with 
‘sum-total of behaviour’, and with the reservation that some items 
of behaviour are more important than others—in the sense that 
they show a greater range of intercorrelation and of ‘belonging- 
ness’, and are more widely predictive than others—this definition 
would probably not be objected to too vehemently. 

Personality study therefore starts with behaviour, and attempts 
to find general laws which will explain this behaviour; it must test 
these laws, and their predictive efficiency, against behaviour in 
situations relevant to the generalization embodied in the law in 
question. Measurement of a particular sector of behaviour is usually 
undertaken by means of a test. Such tests are often classified as 
being either ‘projective’ or ‘psychometric’, but this division is 
hardly adequate to denote a given test. There are a number of 
principles of classification which together denote a given test with 
sufficient accuracy. 


(1) Classification according to Type of Stimulus 


Test material may vary according to the degree of organization 
or structure present in it. In highly structured material, the correct 
response is uniquely determined by the relations existing between 
the different parts of the test; in less well structured material, the 
discrimination of the parts, and their interpretation, often form the 
major task of the subject. This classification clearly does not imply 
a dichotomy; there are different degrees of organization or struc- 
ture from the typical intelligence test at the one extreme, through 
the relatively well-structured Thematic Apperception Test, the 
less well-structured M.A.P.S. or Rorschach Tests, to the almost 
entirely unstructured Stern Cloud-Pictures or the Buck H-T-P 
test at the other. In general, we may say that the more firmly 
structured the test material or the test-situation are, the more 
restricted will be the variety of ways in which the subject can react 
to the test. 


(2) Classification according to Type of Reaction 


The two main ways in which the subject can react to the 
stimulus situation are the verbal and the non-verbal. These can them- 
selves again be classified, the verbal mode of reaction into written 
and spoken, and the non-verbal into autonomic and motor response. 


(3) Classification according to Mode of Response 


The two main modes of response are the creative and the selec- 
tive. In the creative type of response, the subject is required to 
supply the solution to the problem quite unaided and entirely 
from within himself, while in the selective type of response he is 
merely asked to choose one of several predetermined solutions. 
While the selective response method is used more frequently in 
connection with intelligence tests, and the creative response method 
in connection with tests of temperament and character, this cor- 
relation is far from perfect, and interesting advances have been 
made recently by using selective responses in connection with such 
tests as the Rorschach, the Word Association, and the Sentence 
Completion. 


(4) Classification according to Method of Scoring 


Scoring may be objective or subjective. Scoring is objective 
when only one symbol, numerical or otherwise, can be assigned to 
any particular reaction and when all observers would assign the 
same symbol to that reaction. Scoring is subjective when either 
there is considerable disagreement between different competent 
observers regarding the symbol to be assigned or when no attempt 
is made to assign a symbol at all, but only a general impressionistic 
interpretation is given of the subject’s performance. 


(5) Classification according to Interpretation of Score 


The final product of the subject’s reaction to the stimulus,
whether it be verbal or non-verbal, creative or selective, may be
classified as being either symbolic or non-symbolic. Tests of fluency, of
persistence, of suggestibility, of perseveration, or of level of aspira- 
tion give scores which are interpreted directly; others, such as the 
Frank Sexual Symbol test, the Dream Interpretation technique, 
or most drawing, painting, and play tests and techniques make use 
of symbolic interpretation of the products. Both methods of inter- 
pretation may of course be combined in dealing with the results of 
any one test. 


(6) Classification according to Mental Mechanism used 


Tests can be classified on the rather theoretical basis of the 
mental mechanisms believed to be employed in them. Thus, some 
tests are thought to call forth responses based on the principle of 
projection (such as the Thematic Apperception Test, for instance); 
others are based on the hypothesis that responses are determined 
by the principle of identification (the Szondi test); others still make 
use of reaction-formation, regression, and various other Freudian 
mechanisms. Clearly, little agreement can be expected on alloca- 
tion of tests according to this principle, particularly as experi- 
mental evidence for the very existence of the mental mechanisms 
in question is almost completely lacking. 

It is interesting to note in this connection that the one mental 
mechanism which ‘projective’ tests do not usually employ, in the 
opinion of experts in this field, is the mechanism of projection.
Sargent (1945), in her review of ‘projective’ tests, makes a rather
disingenuous defence of this confusing practice, as does Bell (1948) 
in his book on Projective Techniques. Semantic oddities of this kind, 
however, have not prevented the widespread use of this term. 


1 A word should be said here about the term ‘test’. A test is defined by
Warren (1934) as ‘a routine examination administered to individuals belonging
to the same group in order to determine the relative position of a given individual
in the group with respect to one or more mental traits, motor abilities,
etc., or in order to compare one group with another in these characteristics’. It
will be seen that, if we accept this definition, a projection test becomes a contradiction
in terms, as the ‘projective hypothesis’ seems to deny the very existence
of these mental traits which are presupposed in the definition of the term
‘test’. By and large, then, it would appear that the term ‘projection test’ has an
unfortunate resemblance to the term ‘Holy Roman Empire’, which was not,
strictly speaking, an Empire, not entirely Roman, and very far from holy.
Projection tests are not tests as the term is usually understood and they do not
even claim to make use of the mechanisms of projection as defined by Freud and
his adherents. The term is a bad one and should be dropped from psychology.


(7) Classification according to Method of Administration 


Tests may be administered in the form of individual or group 
tests; it is doubtful if this differentiation is of any fundamental 
importance. It is included here for the sake of completeness. 

The seven principles of classification briefly discussed above 
are relatively independent of each other; all possible combinations 
of type of stimulus, type of reaction, method of response, method 
of scoring, interpretation of score, and mental mechanism hypothe- 
sized may be found. Frequently the same test may be used in 
different ways by different investigators, changing its classification 
under some of these headings. 

Two techniques are used almost exclusively in current practice 
of personality testing. They are the questionnaire—well structured, 
with verbal reactions, selective response, objectively scored, and 
not symbolically interpreted—and the so-called projective tests— 
little structured, with verbal reactions, creative response, subjec- 
tively scored, and often symbolically interpreted. It will be noted 
that both procedures duplicate psychiatric methods, and may 
under special circumstances be substituted for them. The question- 
naire duplicates the psychiatric interview; the ‘projective’ test 
duplicates the interpretation of dreams and symptoms by the 
psychiatrist. Both are essentially verbal in nature. 

While in our research we have not neglected these procedures, 
we have laid most stress on what might be called ‘objective 
behaviour tests’. These tests are highly structured, call for non- 
verbal reactions—either autonomic or motor, are objectively 
scored, and require no interpretation. This type of test has 
appeared to us to be in the tradition of behaviouristic psychology. 
It seems premature for clinical psychologists to throw over what is 
most distinctive in their intellectual heritage, in order to don with 
unseemly haste the Emperor’s New Clothes. It is our belief—and 
some support for it will be found in the pages that follow—that 
objective behaviour tests tap fundamental, constitutional depths 
of personality organization which questionnaires and ‘projective 
tests’ do not reach, while conversely these psychiatric types of tests 
are highly reflective of passing moods, sentiments, feelings, and so 
forth. As we are concerned more with the underlying dimensions 
of personality than with the small ripples on the surface, our choice 
may perhaps be intelligible. 


In one other respect will our work be found to deviate from 
that of many others in this field. Students of personality have often 
used personality ratings, made by psychiatrists, teachers, super- 
visors, or laymen as an aid in the classification of human traits. 
Outstanding examples here are, for instance, Burt (1937), Cattell 
(1946), and Moore (1930). In our opinion the value of ratings 
cannot compare with that of tests; tests are measured items of 
performance, while ratings imply judgments formed on the basis 
of personal impression of performances inaccurately perceived, 
sampled under biased conditions, and evaluated according to 
unknown standards. Quite frequently, the very traits rated are 
selected according to criteria which would encounter severe criti- 
cism; Burt’s work, for example, is based on the rating of the strength 
of McDougallian ‘instincts’ in children by teachers. The choice of 
a system of classification some thirty years out of date, and dis- 
credited in contemporary psychology for many valid reasons, 
makes this work interesting more as a statistical exercise than as a 
genuine contribution to the study of personality. Cattell's work is 
not subject to this criticism, and it must be admitted that he has 
seen quite clearly that the value of syndromes and factors derived 
from ratings is dependent on verification through the use of other 
methods. Nevertheless, until such verification is forthcoming—and 
it will be noted that some of the results reported in this volume 
agree well with Cattell’s analysis—conclusions based on ratings 
must at best be regarded as suggestive only, and as possibly telling 
us more about the rater than about the ratee. 

It might be retorted that in our own work ratings have played 
an important part also. This is true, but it should be pointed out 
that our use of ratings and diagnoses has been fundamentally 
different from that of the above-mentioned authors. We have used 
ratings as a kind of scaffolding, to aid in the construction of a solid 
edifice based on objective measures of behaviour; once this edifice 
is constructed the scaffolding may be taken down. It plays its part 
merely as a transitional aid; it is not intended to replace the 
finished building. Just how this method works is described in detail 
in the following chapter. 


Chapter Two 


THE DIMENSIONAL APPROACH 


If our main task is to provide at least a provisional solution to
the taxonomic problem in personality research, then we are
involved automatically in the problem of finding appropriate
dimensions of personality. And for a method to aid in the solution we
must turn to factor analysis, because in spite of the acknowledged
difficulties and weaknesses of this method there does not exist, at
the present stage of our knowledge, any other method which could
aid us in our quest (Eysenck, 1949). This chapter will be devoted
to a discussion of the relation between factor analysis and dimensional
research, as well as to an investigation of certain criticisms
often made of the factorial method. It will also contain an elementary
introduction to the method of criterion analysis which has been
introduced to overcome some of these criticisms, and to bring
factor analysis into closer touch with the hypothetico-deductive
method (Eysenck, 1950).1

1 The first part of this chapter has on purpose been written at a very simplified
level in order that the rest of the book might be intelligible to readers
without a background in factor analysis. It should be omitted by readers
already fully familiar with factorial techniques and their logical basis.

All through this book, then, we are concerned with problems
of psychological dimensions. These problems are absolutely fundamental
in science. If science depends on measurement, we must
know what to measure. Thus we cannot make direct comparisons
of magnitude between things which are qualitatively unlike, i.e.
which do not lie along one dimension. We cannot form a dimensionally
inhomogeneous equation, such as ‘23 hours + 14 horses
= 20 sacks of potatoes + 17 miles’ without violating common
sense as well as physical propriety. Yet similarly dimensionally
inhomogeneous equations are formed constantly by psychologists
and psychiatrists because of our fundamental lack of knowledge
regarding ‘dimensional homogeneity’.

The reader may object that choice of dimensions in psychology,
particularly if such choice is dependent on statistical procedures 
such as factor analysis, is arbitrary, thus setting psychology off 
from the remainder of science. This implies a profound misunder- 
standing of the procedures of physical science. In a very real sense, 
we may say that in physics ‘the choice of dimensions is arbitrary’ 
(Scott-Blair, 1950). ‘If we are moving towards or away from a 
source of light or sound, the colour or pitch appears to change. It
is easy to calculate our speed from the ratio of change in wave-length
to normal wave-length, which is, of course, a dimensionless
number . . . we should be quite justified in defining velocity by
this number instead of as a length divided by a time.’ ‘There is 
nothing absolute about dimensions . . . they may be anything con- 
sistent with a set of definitions which agree with the experimental 
facts’ (Bridgman, 1931). 

If, then, dimensions are in a sense arbitrary in physics, it seems 
unreasonable to expect factor analysis or any other statistical pro- 
cedure to give us psychological dimensions which are not up to a 
point arbitrary. All possible systems of dimensions to describe a 
given set of facts must be convertible into each other, as they all 
must agree with the experimental facts; if two systems of dimen- 
sions disagree an empirical test becomes possible to decide which 
of the two leads to deductions verifiable by experiment. Whenever 
two factor analysts disagree in their analysis of a given table of 
intercorrelations, it is possible either (1) to convert one set of 
factors into the other through a set of intermediate equations, thus 
showing that these are merely alternative dimensional systems 
equally adequate to represent the facts, or else it is possible (2) to 
show that one solution is statistically unsound, or leads to deduc- 
tions which can be disproved. The argument between Spearman 
(1927) and Thurstone (1935) was of the latter kind, leading to a 
disproof of Spearman’s original position; most arguments in the 
literature, however, are of the former kind. 

To acknowledge that one’s choice of dimensions is arbitrary 
does not mean, of course, that any set of dimensions may be chosen. 
In practice, the restriction imposed by the requirement that dimen- 
sions ‘may be anything consistent with a set of definitions which 
agree with the experimental facts’ rules out all but a very few alterna- 
tive sets of dimensions, and as facts accumulate choice becomes 
very restricted indeed. Even to find one single set of dimensions to 
embrace all the known facts of personality research may appear a 
tall order; to find several such sets would tax the imagination of
most psychologists unduly. The undoubted existence of such sets
of dimensions, proposed by the various schools, is accounted for in
the main by the very simple procedure of unmercifully rejecting
all facts not fitting into a given scheme. If psychologists followed
the example of physicists and rejected any model which was
clearly in opposition to experimentally verified facts, the ground
would be cleared of much rubbish the main effect of which is to
hinder the advance of psychological knowledge.

Science, as we have seen, attempts to describe the
world of experience through the formulation of abstract laws and
the creation of abstract categories. This process of abstraction is 
absolutely fundamental to science; without abstraction there can 
be nothing but observation of particular occurrences. As Whitehead 
(1929) puts it, ‘the paradox is now fully established that the utmost 
abstractions are the true weapon with which to control our thought 
of concrete fact. To be abstract is to transcend particular concrete 
occasions of actual happenings. The construction with which the 
scientist ends has the neatness and orderliness that is quite unlike 
the varied and multiform world of common 
grows out of and returns to 
be a precise connection bet: 
which is the goal of scienc 
common sense.’ 

There is of course an abundance 

questions regarding the ‘real existence’ of the ether, or criticisms 
of the concept of the electron as being a ‘statistical artefact’. 
Concepts are useful in science in so far as they help in introducing 
order into a confused field; they cease to be useful when they fail 
to account for all the relevant phenomena, or when they cease 
to give rise to accurate predictions regarding new and hitherto 
unobserved phenomena.1 

Let us start our discussion of factor analysis with a simple and 
straightforward example, in which we shall as always pay attention 
to the scientific logic of the method rather than its mathematical 
foundations. Let us assume, as Spearman did in his revolutionary 
article in 1904, that underlying all cognitive tasks there is one 
general ability, which we may call ‘g’ or ‘intelligence’. Let us also 
assume that different people possess this ability to varying degree, 
and that different tasks call for this ability in varying degree. Let 
us make one further assumption, namely, that there is no other 
ability required in order to succeed in these tasks except ‘g’ and 
something which is specific to each particular task—a specific 
ability which we may call s₁ for task 1, s₂ for task 2, and so on to sₙ
for task n. 

1 We may in this connection draw attention to the procedure used by
Einstein in constructing his theory of relativity. In the first edition of his General
and Special Theory of Relativity, Einstein based his physical interpretation of the
universe on Minkowski's and Riemann's geometrical premises. Thus, he established
for all space-time momenta, including their electro-magnetic behaviour,
a hypothetical, geometrical system of parameters: X₁, X₂, X₃, X₄. It is neither
essential for these parameters to have actual existence in this universe nor for them to
correspond to our sensory perception. They are assumed in a purely mathematical
sense as tools that enable us to explain what we know as the physical universe.
During the year 1933 Einstein adopted the five parameter system (X₁, X₂, X₃,
X₄, X₅) because the previous system failed to explain certain physical momenta,
thus requiring the introduction of a new parameter. No physicist would take
seriously the question: ‘Have X₁, X₂, etc. any real existence?’

Now if by some magic we could obtain an accurate measure of
the ‘intelligence’ of a random sample of people, it would be possible
to correlate this measure—which we may denote by a capital ‘G’
—with each of our tests. Let us suppose that we have four tests—
A, B, C, and D—and that the correlations of these tests with ‘G’
are .9, .8, .7, and .6 respectively. It can be shown that, given our
assumptions, the correlation between any two tests is given by the
product of their respective correlations with ‘G’. In other words,
tests A and B will correlate .9 × .8 = .72, tests C and D will
correlate .7 × .6 = .42, and so forth. This may best be shown in
the form of a table (Table I): 


TABLE I

Correlations    Test A   Test B   Test C   Test D
with ‘G’          .9       .8       .7       .6
   .9           (.81)     .72      .63      .54
   .8            .72     (.64)     .56      .48
   .7            .63      .56     (.49)     .42
   .6            .54      .48      .42     (.36)


The two-digit values in the central part of the table form what 
is known as a matrix, that is to say a rectangular array of figures. 
It will be seen that the diagonal values are put in brackets; this is 
done because it is impossible to correlate a test with itself. We can 
of course correlate parallel forms of a test, or repeat the adminis- 
tration of a test, or use split halves. None of these methods, how- 
ever, would give us values dependent only on ‘g’; they would all be 
influenced by the specific ability required by the test in question, 
by memory, practice, and all sorts of other factors. Consequently 
they are estimated by analogy with the other values in the matrix, 
i.e. they are formed by multiplying the ‘G’ correlation of a test 
with itself, i.e. by squaring it. In this way, .9 × .9 = .81 for test A, 
and so forth. These values are not, as are the other values in the 
table, subject to empirical check. They form part of the general 
hypothesis we are investigating. 
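
The construction of Table I can be sketched in a few lines of Python. This is only an illustration of the one-factor assumption stated above; the function name `one_factor_matrix` and the dictionary layout are our own choices for the sketch and form no part of the original method.

```python
# Correlations of the four tests with 'G', as given in Table I.
saturations = {"A": 0.9, "B": 0.8, "C": 0.7, "D": 0.6}

def one_factor_matrix(sat):
    """Return {(test_i, test_j): correlation} under the one-factor assumption.

    Each off-diagonal value is the product of the two 'G' correlations;
    each diagonal value is the square of a test's own 'G' correlation.
    """
    return {(i, j): round(sat[i] * sat[j], 2) for i in sat for j in sat}

matrix = one_factor_matrix(saturations)

# Print the matrix row by row, in the order A, B, C, D.
for i in saturations:
    print(i, [matrix[(i, j)] for j in saturations])
```

The diagonal entries produced here are hypothetical in exactly the sense described in the text: they follow from the assumption and cannot be checked against observation.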

Let us look at this table a little more closely. All the values
in column D will be seen to be proportional to those in column C;
.63/.54 = .56/.48 = .49/.42 = .42/.36 = 1.167. Similarly, any
other pair of columns is proportional. This fact can be put in many
different ways. Let us take the intercorrelations of two tests—say
A and D—with two other tests—say B and C. These correlations
are:

           B      C
    A     .72    .63
    D     .48    .42

By the proportionality rule, .72/.48 = .63/.42. This can also be
expressed in the form: .72 × .42 = .48 × .63, which is identical
with (.72 × .42) - (.48 × .63) = 0. This last expression is known
as the tetrad criterion; it expresses the general rule that if in a
given set of tests the assumptions made above are borne out in
fact, then any tetrad (set of four correlations) formed in this
manner must come to zero within the limits of the sampling
error. 
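
The tetrad criterion lends itself to a direct numerical check. The sketch below, which assumes the off-diagonal values of Table I, forms the three tetrad differences available for four tests; the helper names `corr` and `tetrads` are hypothetical and are introduced only for this illustration.

```python
# Off-diagonal correlations taken from Table I.
r = {
    ("A", "B"): 0.72, ("A", "C"): 0.63, ("A", "D"): 0.54,
    ("B", "C"): 0.56, ("B", "D"): 0.48, ("C", "D"): 0.42,
}

def corr(x, y):
    """Look up a correlation regardless of the order of the two tests."""
    return r[(x, y)] if (x, y) in r else r[(y, x)]

def tetrads(a, b, c, d):
    """Return the three tetrad differences for four tests."""
    return (
        corr(a, b) * corr(c, d) - corr(a, c) * corr(b, d),
        corr(a, b) * corr(c, d) - corr(a, d) * corr(b, c),
        corr(a, c) * corr(b, d) - corr(a, d) * corr(b, c),
    )

diffs = tetrads("A", "B", "C", "D")
print([round(t, 4) for t in diffs])
```

With observed rather than constructed correlations the differences would not vanish exactly, and each would have to be judged against its sampling error, which this sketch does not attempt to compute.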

We can now dispense with the idea that by some magic we had 
managed to obtain an accurate measure of the ‘intelligence’ of the 
persons in our sample, a measure which enabled us to calculate 
the correlations of ‘G’ with tests A, B, C, and D. We start out, as 
in fact we always must do, with the matrix of intercorrelations, as 
actually observed on a sample of the population. We also start out 
with an hypothesis—the hypothesis that these intercorrelations are 
due to one factor (‘g’) only, and that there is nothing in common 
between any two tests that cannot be accounted for in full by this 
factor. We can then deduce, in the manner shown above, that if 
our hypothesis is indeed correct, then it would follow that the 
observed intercorrelations should show a certain pattern of pro- 
portionality—in other words, that all the possible tetrads should 
come to zero. Having verified this deduction, we may wish to 
know how closely each of our four tests correlates with this 
hypothetical factor ‘g’. This can be calculated very easily in a 
number of ways. Quite obviously, if we knew the diagonal values, 
we could immediately calculate the required correlations by taking 
in each case the square root of the diagonal value; √.81 = .9, 
√.64 = .8, and so forth. Unfortunately, the diagonal values are 
not given to us. But we may deduce them from a knowledge of the 
proportional arrangement of the whole table. We can form a 
tetrad in which one of the values is the sought-for diagonal value, 
which as it is unknown we may denote by an X. We then get 
equations of the kind: 


(X × .48) - (.72 × .54) = 0

from which we can then calculate that X = .81, and that consequently
the correlation between ‘G’ and A is equal to .9. This
correlation is customarily referred to as the ‘saturation’ of test A
with factor ‘g’, or more simply test A’s factor saturation. The factor
saturations of the other tests can be calculated in a similar manner. 
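
The same calculation may be carried through for all four tests at once. The following sketch assumes the off-diagonal values of Table I and solves the tetrad equation for the unknown diagonal value of each test. Averaging the estimates obtained from the different pairs of other tests is our own simplification for the purpose of illustration, not a step prescribed in the text.

```python
import math
from itertools import combinations

# Off-diagonal correlations of Table I (the diagonal is treated as unknown).
r = {
    ("A", "B"): 0.72, ("A", "C"): 0.63, ("A", "D"): 0.54,
    ("B", "C"): 0.56, ("B", "D"): 0.48, ("C", "D"): 0.42,
}
tests = ["A", "B", "C", "D"]

def corr(x, y):
    """Look up a correlation regardless of the order of the two tests."""
    return r[(x, y)] if (x, y) in r else r[(y, x)]

def saturation(test):
    """Estimate a test's factor saturation from the off-diagonal values.

    For every pair (j, k) of other tests, solve
        X * r_jk - r_tj * r_tk = 0
    for the diagonal value X, average the estimates, and take the root.
    """
    others = [t for t in tests if t != test]
    estimates = [corr(test, j) * corr(test, k) / corr(j, k)
                 for j, k in combinations(others, 2)]
    return math.sqrt(sum(estimates) / len(estimates))

saturations = {t: round(saturation(t), 2) for t in tests}
print(saturations)
```

For test A the pair (B, D) reproduces the equation given above, X = (.72 × .54)/.48 = .81.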

So far, we have been dealing with a purely hypothetical 
example. To lend an air of verisimilitude to the proceedings, let us 
look at the actual correlations observed between four ‘tests’ of 
neuroticism (Table II). Each of these ‘tests’ is a questionnaire, 
carefully designed to measure a different area of this general field. 
Test A measures psychosomatic complaints, test B measures child- 

not work through multiple regression and other orthodox procedures
in order to get maximum correlation between tests and 
criterion? 

Let us take an example. In the matrix of correlations below 
(Table IV) let A stand for a conventional test of intelligence, and 
B, C, and D for ratings made by three teachers of the intellectual 
ability of a group of children known to them, all of whom have 
been given the test. From these intercorrelations we can show that 
the hypothesis of a single factor as being responsible for all the 
observed values is tenable, and we can proceed to calculate the 
factor saturations of test and teachers’ ratings alike. These
are shown at the top and at the left side of the matrix of intercorrelations.


rs’ ratings that is not also 
other words, test and ratings alike are 


giving an estimate of the same underlying, hypothetical trait of 


intelligence. That being s 
to the extent indicated 
teachers’ ratings are rath 
we average the teachers’ ratings 
criterion and rely exclusively on our test, as giving us a better 
estimate of a child's intelligence than do the ratings. (In practice, 
of course, many more tests, teachers, and children would have to 
be used before such a conclusion could be accepted. We are only 
interested here in the logic of the argument, not in its practical 
aspects.) 

Another method of looking at the question of the criterion 
suggests itself. In Table II we gave the intercorrelations of four 
tests of neuroticism, and calculated their saturations for this factor. 
It is possible to analyse the differences in score on each of these 
four tests for a normal and a neurotic group, and to compare the 
success with which each test accomplishes the task of segregation 
with its factor saturation. If the factor we have isolated is really 
identified correctly as one of neuroticism, then we should expect 
the ability of a test to discriminate between normals and neurotics 
to be proportional to its saturation for the factor; the test with the 
highest factor saturation should give the best discrimination, the 
test with the second highest saturation the second best discrimina- 
tion, and so forth. Table V shows percentage of normals and 
neurotics having ‘neurotic’ scores on each of the tests, difference 
between the two percentages, biserial correlation of each test with 
the ‘normal versus neurotic’ dichotomy, and the factor saturations 
previously given. 


TABLE V

                              Normals   Neurotics   Difference   Biserial       Factor
                                 %          %                    Correlations   Saturations
Psychosomatic complaints        29         89           60          .66            .83
Childhood symptoms              20         53           33          .38            .67
Acceptance of soldier role      31         59           28          .35            .56
Sociability                     16         45           29          .33            .46


It will be seen that the column giving the factor saturations is 
directly proportional to the column giving the biserial correlations 
of each test with the dichotomy ‘normal versus neurotic’, as 
required by our hypothesis. As this column of correlations between 
tests and criterion will play a considerable part in our argument 
later, we may perhaps give it a special name and call it the 
‘Criterion Column’. At the moment, let us merely note the fact 
that this proportionality is unlikely to have arisen by chance; that 
it is in line with our prediction based on a certain interpretation 
of our factor; and that our faith in the accuracy and correctness of 
our interpretation is therefore considerably strengthened. We have 
not proved that our hypothesis is correct; we have merely shown 
that a relatively large number of facts can without distortion be 
subsumed under one consistent set of concepts. 

In doing so, however, we have made an assumption which is 
not supported by the data so far given. We have assumed that 
‘neuroticism’ is a trait which forms a continuum from the ‘normal’ 
to the ‘neurotic’ end, and that our groups of normals and neurotics 
are merely random samples chosen from different points of this 
continuum. We may present this assumption diagrammatically, 
as in Figure 3, letting the ordinate represent the hypothetical
continuum, and plotting the number of people falling at each
point along the abscissa. We may assume a normal distribution
for the sake of the argument; in actual fact the particular form of
distribution is not of any importance. Points near the plus end of
our hypothetical continuum represent well integrated, emotionally
stable, non-neurotic personalities; points towards the minus
end of the hypothetical continuum represent poorly integrated,
emotionally unstable, neurotic personalities. At point A we have
drawn a line to indicate that people located to the right of this
point are particularly liable in our society to be referred to
psychiatrists, to be labelled ‘neurotic’, and to be treated by means
of a variety of medical and psychological methods. Chance errors
will ensure that some people to the right of this point will not in
fact be so diagnosed and treated, although strictly speaking they
ought to be, while some people to the left of point A, particularly
those in its close proximity, will be diagnosed as neurotics, although
they deserve this label less than others who manage to
avoid it. But by and large such chance errors will only attenuate
the accuracy of this division into two classes; they will not
invalidate it.

[Figure 3.—Hypothetical ‘Neuroticism’ Continuum. Labels recoverable from the figure: a ‘+’ end at the left; points X, A, and Y marked along the continuum; NORMAL to the left of A and NEUROTIC to the right.]

This hypothesis should be contrasted with another hypothesis 
which is not contradicted by any of the data we have so far con- 
sidered. It may be maintained, and in fact it is often maintained 
both by popular opinion and by some psychiatrists, that ‘neurotics’ 
are qualitatively different from ‘normals’. This view would recognize 
no intervening stages; a person is either a neurotic, or he is not. 
In some cases this clear qualitative difference is obscured, to 
all appearances, by other factors overlaying this fundamental 
dichotomy, but when these are cleared away a person is found to 
fall quite definitely into one or the other of these two categories. 
Such categorical or qualitative thinking has always characterized 
early stages of scientific progress; usually it is shown later that 
what was considered qualitatively different at one time is really 
reducible to quantitative variation along some kind of continuum. 
What is required is an experimentum crucis, i.e. an experiment in 
which deductions are made from the two contrasting hypotheses 
which are mutually contradictory, so that one set of deductions 
can definitely be shown to be false. Again, this does not prove that 
the other hypothesis is correct; it merely adduces evidence in its 
support. 

On the basis of the qualitative hypothesis, we would assume 
that with respect to whatever it was that differentiated normals 
from neurotics, both populations would be homogeneous; in other 
words, there would be no gradations of ‘normality’ or ‘neuroticism’.
This in turn would imply that tests which differentiated 
between normals and neurotics would not show any particular 
pattern of intercorrelations within the normal group only, or 
within the neurotic group only. On the average, these inter- 
correlations should amount to zero. 

On the basis of the quantitative hypothesis, our prediction 
would be quite different. We may approach this matter in the 
following way. Let us consider two tests, A and B, both of which 
are known to discriminate significantly between normals and 
neurotics. Now let us take our group of normal subjects only. If 
our quantitative hypothesis is correct, it follows that this normal 
group may be subdivided into two parts at some arbitrary point X. 
Those to the left of X are relatively more stable, better integrated 
people than those to the right of X. It would follow that they 
should do better on both tests A and B than those on the right. 
But if there is a tendency for some people to do well on both tests, 
and for others to do poorly, then quite clearly these two tests 
would be found to be correlated. On the basis of our quantitative 
hypothesis, therefore, we would expect A and B to be positively 
correlated. Generalizing this to n tests, we may say that any group 
of tests which discriminate between normals and neurotics should 
show positive intercorrelations when only the normal group is 
considered. The same considerations, of course, apply to the 
neurotic group. This group also may arbitrarily be divided into 
two at an arbitrary point Y, and again it can be shown that on the 
basis of our quantitative hypothesis positive intercorrelations 
would be expected. 
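
The contrast between the two predictions can be made concrete by a small simulation. The sketch below is purely illustrative: the cut-off of 1.0 for point A, the proportion of .84 normals, the error variance, and the function names are all assumptions of ours, chosen only to show the direction of the effect within the normal group.

```python
import random

def pearson(xs, ys):
    """Product-moment correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def normal_group_correlation(quantitative, n=4000, seed=1):
    """Correlate tests A and B within the 'normal' group only."""
    rng = random.Random(seed)
    a_scores, b_scores = [], []
    for _ in range(n):
        if quantitative:
            trait = rng.gauss(0.0, 1.0)     # position on the continuum
            normal = trait < 1.0            # hypothetical cut-off (point A)
        else:
            normal = rng.random() < 0.84    # class membership only
            trait = 0.0 if normal else 3.0  # no gradations within a class
        if normal:
            a_scores.append(trait + rng.gauss(0.0, 1.0))  # test A = trait + error
            b_scores.append(trait + rng.gauss(0.0, 1.0))  # test B = trait + error
    return pearson(a_scores, b_scores)

r_quantitative = normal_group_correlation(quantitative=True)
r_qualitative = normal_group_correlation(quantitative=False)
print(round(r_quantitative, 2), round(r_qualitative, 2))
```

Under the quantitative assumption the two tests correlate positively among normals alone; under the qualitative assumption the within-group correlation differs from zero only by sampling error.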

We go one step further. Not only should the intercorrelations 
(within either the normal or the neurotic group) be positive; they 
should be proportional to the ability of the respective tests to dis- 
criminate between normals and neurotics. If a test discriminates 
well between normals and neurotics, then it should intercorrelate 
highly with other tests; if it discriminates poorly, then it should 
show low intercorrelations. In other words, if we take our matrix 
of intercorrelations for the normal group and factor analyse it, 
then our factor saturations should be proportional to the correla- 
tions of the tests with the normal versus neurotic criterion. 
Similarly, if we take our matrix of intercorrelations for the neurotic 
group and factor analyse it, then our factor saturations should be 
proportional to the correlations of the tests with the normal versus 
neurotic criterion. As a corollary, it follows that the factor satura- 
tions derived from the two matrices (the normal and the neurotic) 
should be proportional to each other. 

An example may make this argument clearer. In Table II we 
have given the intercorrelations and the factor analysis of four 
tests of neuroticism, for a mixed normal and neurotic population. 
Below are given the matrices of intercorrelations for normal and 
neurotic subjects separately. Results of a factor analysis are also 
given in each case. 

In Table VIII are given the factor saturations for the four tests 
for the normal and the neurotic groups separately and the 
Criterion Column (i.e. the column of biserial correlations between 
tests and the dichotomy normal versus neurotic). It will be obvious 
to the casual observer that the three columns show a distinct 
proportionality. 

TABLE VI 
NORMAL GROUP 

                 .80      .52      .50      .28
      .80       (.64)     .41      .46      .19
      .52        .41     (.27)     .20      .16
      .50        .46      .20     (.25)     .15
      .28        .19      .16      .15     (.08)

TABLE VII 
NEUROTIC GROUP 

                 .65      .68      .42      .42
      .65       (.42)     .44      .28      .28
      .68        .44     (.46)     .30      .31
      .42        .28      .30     (.18)     .18
      .42        .28      .31      .18     (.18)
TABLE VIII 

   Factor Saturations:    Criterion    Factor Saturations:
   Normal Group           Column       Neurotic Group

         .80                .66              .65
         .52                .38              .68
         .50                [?]              .42
         .28                .33              .42

The entry shown as [?] is illegible in the source copy. 


In practice, of course, no conclusions should be based on 
results from four tests, and on simple casual observation of degree 
of proportionality. We may therefore broaden the scope of our 
example and present data from a rather larger sample of tests. 
Our example was originally taken from an investigation employ- 
ing fifteen tests altogether (Stouffer, 1950). We have calculated, 
from correlations given in the original report, factor saturations 
for the normal group and factor saturations for the neurotic 
group, and compared these with the criterion column. The full 
results are set out in Table IX. Again, it will be seen that the 
factor saturations are proportional to each other, and to the 
Criterion Column. This proportionality can best be expressed in 
the form of a correlation coefficient. The two columns of factor 
saturations correlate together .82; the Criterion Column correlates 
with the factor saturations for the normal group .88, and with the 
factor saturations for the neurotic group .81. All three values are 
highly significant, and are clearly different from the zero values to 
be expected on the basis of the qualitative hypothesis. We may 
therefore conclude that our data lend support to the quantitative 
hypothesis, and refute the qualitative hypothesis. 
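
The use of a correlation coefficient as an index of proportionality can be sketched as follows. The two columns are the four-test saturations for the normal and neurotic groups as read from Tables VI and VII in this copy; the function name is an illustrative choice, not taken from the original report.

```python
import math


def pearson(xs, ys):
    """Product-moment correlation between two equally long columns."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Saturations of the four tests (margins of Tables VI and VII).
normal_saturations = [0.80, 0.52, 0.50, 0.28]
neurotic_saturations = [0.65, 0.68, 0.42, 0.42]

r_columns = pearson(normal_saturations, neurotic_saturations)

# A column and an exact multiple of it correlate perfectly.
r_exact = pearson(normal_saturations, [2 * a for a in normal_saturations])

print(round(r_columns, 2), round(r_exact, 2))
```

With only four tests such a coefficient is very unstable, which is why the fifteen-test comparison of Table IX carries the weight of the argument.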


TABLE IX 

                                        Criterion  Factor Saturations:  Factor Saturations:
Title of Scale                          Column     Normal Group         Neurotic Group

 1. Psychosomatic complaints              .66       .69  (.15)           .56  (.16)
 2. Childhood neurotic symptoms           .38       .49  (.09)           .58  (-.01)
 3. Personal adjustment (-)               .42       .67  (.09)           .68  (.05)
 4. Over-sensitivity                      [?]       .48  (.45)           .56  (.50)
 5. Childhood fears                       .33       .42  (.16)           .52  (.02)
 6. Acceptance of soldier role (-)        [?]       .58  (-.15)          .48  (-.29)
 7. Worrying                              [?]       .59  (.02)           .56  (.09)
 8. Sociability (-)                       [?]       .33  (.03)           .56  (.08)
 9. Participation in sports (-)           .28       .28  (-.14)          .34  (-.26)
10. Identification with war (-)           .12       .40  (-.02)          .31  (-.00)
11. Childhood fighting behaviour (-)      .18       .30  (-.32)          .17  (-.59)
12. Childhood school adjustment (-)       [?]       .15  ([?])           .19  (.06)
13. Relations with parents                .12       [?]  (.06)           .20  (.20)
14. Emancipation from parents (-)         .08       .20  (-.34)          .18  (-.24)
15. Mobility                              [?]      -.04  (.31)           .05  (.24)


Figures in brackets are referred to on page 58 and should be omitted at 
first reading. Tests whose scores have been multiplied by -1 in order to make 
coefficients in the Criterion Column positive are marked thus: (-). 
Entries shown as [?] are illegible in the source copy. 


So far, we have been concerned with patterns of intercorre- 
lations which show the particular hierarchical or proportional 
features which are summed up in saying that the tetrad differ- 
ences vanish. As it happens, however, matrices showing this 
particular pattern are quite rare, and in the majority of cases the 
hypothesis that only one factor is responsible for the intercorrela- 
tions can be shown to be false. When that happens we move over 
into rather more difficult territory, and begin to deal with multiple 
factor analysis, i.e. we attempt to account for our observed correla- 
tions in terms of several independent factors. 

Let us again begin with an example. Below (Table X) are 
given the intercorrelations of four tests (questionnaires) dealing 
with lack of personal adjustment, oversensitivity, lack of 
emancipation from parents, and mobility. No factor analysis was 
carried out, and no diagonal values are therefore given. 


TABLE X 

            A       B       C       D
    A       --     .49     .11     .04
    B      .49      --     .05     .12
    C      .11     .05      --    -.28
    D      .04     .12    -.28      --


It will be seen that the negative correlation between tests C and 
D completely upsets any attempt to account for the correlations in 
this table in terms of one factor common to all the tests. Except 
for it, all the other tests correlate together positively, showing that 
they all measure something in common— presumably the same 
factor of ‘neuroticism’ we found before. Then how can we account 
for the presence of this negative correlation? 
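
The failure of the single-factor account can be checked directly with the tetrad differences, which should all vanish if one common factor were sufficient. The following sketch uses the correlations of Table X; the function and variable names are illustrative only.

```python
# Correlations from Table X (tests A, B, C, D).
r = {
    ("A", "B"): 0.49, ("A", "C"): 0.11, ("A", "D"): 0.04,
    ("B", "C"): 0.05, ("B", "D"): 0.12, ("C", "D"): -0.28,
}


def corr(x, y):
    """Look up a correlation regardless of the order of the two tests."""
    return r[(x, y)] if (x, y) in r else r[(y, x)]


def tetrad_differences(a, b, c, d):
    """The three tetrad differences that can be formed from four tests."""
    return (
        corr(a, b) * corr(c, d) - corr(a, c) * corr(b, d),
        corr(a, b) * corr(c, d) - corr(a, d) * corr(b, c),
        corr(a, c) * corr(b, d) - corr(a, d) * corr(b, c),
    )


tetrads = tetrad_differences("A", "B", "C", "D")
print([round(t, 4) for t in tetrads])
```

Two of the three differences are of the order of -.14 to -.15, far from zero relative to correlations of this size, and both owe their size to the product of the A-B correlation with the negative C-D correlation.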

Perhaps the nature of the tests will give us a clue. ‘Emancipa- 
tion from parents’ is defined by the authors of the test as involving 
both actual physical emancipation involved in leaving home, and 
the psychic emancipation implied by the establishment of satis- 
factory heterosexual relationships. ‘Mobility’ denotes unstable em- 
ployment record and much geographical moving around. Both 
the ‘mobile’ and the ‘non-emancipated’ tend to be more neurotic; 
yet the correlation between the two tendencies is negative. Clearly 
another factor is at work over and above ‘neuroticism’, and it may 
be surmised that there are two kinds of neurotic: the type that is 
caught up in mother’s apron strings, is shy, sensitive, worrying, 
‘dysthymic’, anxious, and depressed; and the more hysterical type 
that has no difficulty in breaking emotional bonds (perhaps be- 
cause they are never very firmly established?), that has never 
settled down, has little persistence and seldom sticks long at any- 
thing once the initial enthusiasm has worn off. In other words, 
we may here be dealing with a manifestation of the introvert- 
extravert dichotomy as popularized by Jung (1923), which 
stemmed from just such a clinical differentiation, made originally 
by Janet (1909), of the main neurotic syndrome. 

This hypothesis was suggested by our small matrix of 
intercorrelations; clearly it requires some proof before it can be 
taken very seriously. If it were correct, what other tests would we 
expect to correlate with this hypothetical introvert-extravert 
continuum? 
The introverted, non-emancipated child would presumably not 
have done as much childhood fighting as the extravert; he would 
not have participated in sports to the same extent; he would have 
greater difficulties in assuming the soldier role. The extravert 
might be expected to show, in addition to greater mobility, a 
tendency to have less close relations with his parents, and to be 
easily offended and resentful of criticism (grouped together by the 
authors of the test as ‘oversensitivity’, perhaps a somewhat 
misleading term). 

Let us return to Table IX. We have extracted a general factor 
of ‘neuroticism’ from the intercorrelations of fifteen tests for normal 
and neurotic groups separately. Perhaps we have left something 
unaccounted for in the original tables of correlations. Let us there- 
fore form our product matrices and subtract them from our 
original matrices of observed correlations. These new residual 
matrices, i.e. matrices consisting of correlations left over after the 
first factor has been extracted, may be factor analysed in a similar 
way to that used on our original matrix. When we do that, we 
obtain the results given in Table IX, in brackets. It will be noticed 
that the new factors (one from the normal, the other from the 
neurotic group) are bipolar, that is to say, they have both negative 
and positive saturations. This is an inevitable feature of the method 
of extraction to which we shall return again; here let us just note 
that very roughly our anticipation is borne out. We find that 
‘emancipation’ is at the opposite end of the scale to ‘mobility’, and 
that roughly at least the other tests are grouped as predicted. This 
second, bipolar, factor is not likely to be a mere chance 
phenomenon, because it appears in a very similar form in two entirely 
independent factor analyses (the one for the normal, the other for 
the neurotic group); the correlation between the two factors is .91. 
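
The formation of a product matrix and a residual matrix can be sketched on a small scale. The example below uses the four-test matrix and first-factor saturations of Table VI (normal group) as read from this copy, not the fifteen-test matrices of the original report, which are not reproduced here.

```python
# First-factor saturations and observed correlations from Table VI.
# The diagonal holds the communalities shown in brackets in the table.
saturations = [0.80, 0.52, 0.50, 0.28]
observed = [
    [0.64, 0.41, 0.46, 0.19],
    [0.41, 0.27, 0.20, 0.16],
    [0.46, 0.20, 0.25, 0.15],
    [0.19, 0.16, 0.15, 0.08],
]

n = len(saturations)

# Product matrix: the correlations the first factor alone would produce.
product = [[saturations[i] * saturations[j] for j in range(n)]
           for i in range(n)]

# Residual matrix: what is left over after the first factor is removed.
residual = [[observed[i][j] - product[i][j] for j in range(n)]
            for i in range(n)]

for row in residual:
    print([round(x, 3) for x in row])
```

The residuals contain both positive and negative values, which is why a second factor extracted from them is bipolar.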

This example is not presented as proof of the existence of an 
introversion-extraversion factor; quite clearly there are too many 
steps missing in the argument, and too much subjectivity in the 
reasoning, to allow us to reach any such far-reaching conclusions. 
What we can conclude, however, without much doubt attaching 
to our argument, is that more than one factor is required to 
account for the observed correlations. Does this requirement affect 
in any way the methodological considerations discussed in 
connection with the case of the ‘single factor’ hypothesis? 
Unfortunately the breakdown of the simple model necessitates a 
somewhat complex rethinking of the logic underlying factor analysis, 
and to this we must next turn. Before doing so, however, we must 
introduce a simplified, diagrammatic method of presenting the results 
of factorial analyses which will be indispensable for our argument 
if we wish to avoid complex mathematical expressions. 

Let us go back to Table IX, in which were given two factors 
extracted from each of the two matrices of intercorrelations 
observed, i.e. that derived from the normal and that derived from 
the neurotic population. Consider only the values obtained from 
the normals. Casual observation shows, and it can be proved by 
actual calculation, that the two factors we extracted from this 
matrix are independent of each other, i.e. they are not correlated. 
The fact that a test has a high saturation on one factor does not 
enable us in the slightest to predict whether its saturation on the 
other factor will be high, low, or negative. This notion of inde- 
pendence can be shown to correspond to the concept of ortho- 
gonality (being at right angles to each other) in geometry. We 
may therefore plot our two factors at right angles to each other, as 
in Figure 4. Each line represents a factor, the horizontal line the 
first factor, the vertical line the second factor. Both are equal in 
length, and degrees of saturation are marked off on each, going 
from +1.00 through .00 to -1.00. A circle is drawn around this 
structure to delimit what is called the ‘two-factor space’. It is of 
course easy to add other factors to this structure; thus a third 
factor might be added, at right angles to the other two, which 
would stick out from the intersection of factors one and two (the 
origin) at right angles to the plane of the paper. Factors beyond 
three cannot be represented visually, but can easily be treated 
mathematically according to the rules of n-dimensional geometry; 
factors are then considered to lie in hyper-space. 

So far we have only drawn the framework within which our 
observed factors can be plotted. Test 1 has saturations of .69 and 
.15 on the two factors; consequently we plot it in Figure 4 in the 
corresponding position. Tests 4 (with saturations of .48 and .45), 
6 (with saturations of .58 and -.15), and 14 (with saturations 
of .20 and -.34) have also been plotted in the diagram. The 
other tests are not plotted because they would obscure certain 
important relations which are vital to an understanding of the 
implications of this graphic method of representation. 

Let us connect the points marking the positions of tests 4 and 14 
to the origin by means of straight lines. It can be shown by simple 
trigonometric calculation that if we multiply the cosine of the 
angle separating these two lines (angle α) by the product of the 
lengths of these two lines, we obtain the correlation between the 
two tests. The same holds for the lines 
representing tests 1 and 6. In each case, the correlation between 
any two tests is represented by the product of the length of the 
lines connecting them with the origin, multiplied by the cosine of 
the angle between the two lines. 
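
The trigonometric relation can be verified numerically. The sketch below takes the normal-group saturations of tests 4 and 14 from Table IX and compares the product of the two vector lengths and the cosine of the angle between them with the sum of the products of corresponding saturations; the two routes must agree. The value obtained is the correlation reproduced by the two factors, which need not equal the observed correlation exactly.

```python
import math

# Saturations on factors one and two (normal group, Table IX).
test_4 = (0.48, 0.45)
test_14 = (0.20, -0.34)


def length(v):
    """Length of a test vector from the origin."""
    return math.hypot(v[0], v[1])


def angle_between(u, v):
    """Angle in radians separating two vectors in the two-factor plane."""
    return math.atan2(v[1], v[0]) - math.atan2(u[1], u[0])


# Route 1: product of the lengths multiplied by the cosine of the angle.
r_from_angle = (length(test_4) * length(test_14)
                * math.cos(angle_between(test_4, test_14)))

# Route 2: sum of products of the saturations on each factor.
r_from_products = sum(a * b for a, b in zip(test_4, test_14))

print(round(r_from_angle, 3), round(r_from_products, 3))
```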

The lines which we drew between the origin and the points 
representing tests 4 and 14 are technically called vectors. (A vector 
is a line having a given length and direction.) It will be seen that 
each test is represented by a vector, and that these vectors form a 
system which gives a diagrammatic representation of the relations 
obtaining between the tests, just as the matrix of intercorrelations 
gave an arithmetical representation of these relations. In either 
case we are dealing with a pattern which we are seeking to describe 
in the simplest possible way. 

Let us next plot in two-factor space all 15 tests, using the 
neurotic sample values from Table IX. Figure 5 shows the 
results. The reader may care to plot the values from the normal 
sample himself for comparison; they are very similar to, though not 
exactly identical with, those shown in Figure 5. 

It will be noted that the upper and the lower halves of this 
diagram are rather nicely balanced. There are more points in the 
upper half, but those in the lower half have higher saturations on 
the second factor. In fact, it would appear as if the first factor axis 
passed exactly through the centre of the whole swarm of points. 
This is indeed the case; the method of extraction of a factor 
ensures that it will as it were be a multidimensional average (a 
‘centroid’) of all the vectors involved. This feature of factor 
analysis is at the same time a strength and a weakness; just as an 
average may be meaningful or meaningless, dependent upon 
precisely what is being averaged, so the factors extracted from 
a matrix may be meaningful or not dependent upon what pre- 
cisely is being intercorrelated. Factor analysis is often considered 
as a kind of sausage machine—correlations are put in and out 
comes the psychologically meaningful and scientifically valuable 
result. This view is not shared by any responsible investigator; like 
all other statistical methods, factor analysis is far from fool-proof, 
and where psychological insight, careful planning, and an under- 
standing of the hypothetico-deductive nature of scientific methodo- 
logy are lacking, no mathematical brilliance can suffice to salvage 
the results of misapplied ingenuity. To admit that factor analysis 
has occasionally been misapplied, and that results reported by 
some of its overenthusiastic users have often been absurd, does not 
amount to an admission that critics are justified in quoting such 
examples in condemnation of factor analysis as a scientific method; 
almost any scientific tool can be misused by the uninitiated, regard- 
less of its general value and importance. Criticism must be based, 
not on occasional absurdities, but on the best available examples 
of the method properly used and correctly interpreted according 
to its own rules. Any other evaluation is merely semantic shadow- 
boxing. 
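
The sense in which a first factor is an average of the test vectors can be made concrete. The sketch below applies the simplest form of centroid extraction, in which each saturation is the column sum divided by the square root of the grand total of the matrix, to the Table VI matrix as read from this copy. It assumes that the communalities are already in the diagonal and that no sign reflection is needed; it is not claimed to be the exact routine by which the published saturations were obtained, and it agrees with them only approximately.

```python
import math


def centroid_saturations(matrix):
    """First centroid factor: column sums over the root of the grand total."""
    column_sums = [sum(column) for column in zip(*matrix)]
    grand_total = sum(column_sums)
    return [s / math.sqrt(grand_total) for s in column_sums]


# Table VI (normal group), communalities in the diagonal.
table_vi = [
    [0.64, 0.41, 0.46, 0.19],
    [0.41, 0.27, 0.20, 0.16],
    [0.46, 0.20, 0.25, 0.15],
    [0.19, 0.16, 0.15, 0.08],
]

first_factor = centroid_saturations(table_vi)
print([round(a, 2) for a in first_factor])
```

Because every saturation depends on the grand total of the whole matrix, adding or removing a test alters all of them, which is the root of the invariance problem.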

The admission that our factors are nothing but averages is a 
very damaging one; indeed, criticism of factor analysis which is 
not merely based on misunderstanding centres largely on this 
point. The ‘g’ we extract from a matrix obeying the rules of the 
tetrad criterion is not an average in this sense; we can add or sub- 
tract tests from our battery at will, and as long as each obeys the 
tetrad criterion, we can be sure that we are always measuring the 
same, identical ‘g’. But the moment that we are dealing with a 
matrix which does not obey the tetrad criterion, and which requires 
therefore more than one factor, we leave objectivity dangerously 
far behind, and are forced to enter into many circuitous byways 
before emerging once more into the main stream of scientific 
methodology. 

A factor, in order to be useful in ordering our descriptions of 
the multiform facts of psychology, must have two characteristics. 
It must be unique, and it must be invariant. By unique we mean that 
from a given matrix one and only one solution should be possible 
if the rules of extraction are followed. By invariant we mean that 
however much the battery of tests may be altered by adding other 
tests, or subtracting tests already included in it, nevertheless the 
factor saturations of those tests which are retained in the battery 
should not change. It may be said that rules can be laid down 
which ensure the uniqueness of the factors extracted by any of the 
current methods; it is with respect to invariance that our troubles 
start. 

Let us look again at Figure 5. Factor 1 is balanced carefully 
between the positive and the negative sides of the bipolar second 
factor, in such a way that the factor saturations of the tests having 
positive saturations on the second factor balance out almost 
exactly the factor saturations of the tests having negative 
saturations. What would happen if we omitted test 11 from our 
calculations? The relations between all the other tests (as 
indicated by their positions in the two-factor space) would remain 
quite unaffected, because they represent the correlations of these 
tests, which are of course not changed in any way by dropping any 
particular test. But the position of our two axes would be changed. 
The first one would pass through the new average (centroid) of 
our cluster of fourteen tests, and would therefore lie approximately 
as shown in Figure 5 as position A. The second factor axis, in order 
to remain at right angles to the first factor axis, would of course 
shift through a similar angle to position A′. Take out another test 
now, say test 6, and again the position of the axes will shift, this 
time to B and B′. Take out tests 9 and 14 too, and now our axes 
would assume positions C and C′. Clearly, these axes are in no 
way invariant, but shift about depending entirely on the make-up 
of the battery of tests. In other words, they are almost useless in 
their present form for our purpose. 
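
The shifting of the axes can be followed numerically. The sketch below uses the neurotic-group saturations of Table IX as read from this copy (a few digits are uncertain) and treats the direction of the first axis as the direction of the sum of the test vectors, which is a simplification of the centroid procedure. It reports how far that direction turns, in degrees, as tests are removed.

```python
import math

# (first factor, second factor) saturations, neurotic group, Table IX.
neurotic = {
    1: (0.56, 0.16), 2: (0.58, -0.01), 3: (0.68, 0.05), 4: (0.56, 0.50),
    5: (0.52, 0.02), 6: (0.48, -0.29), 7: (0.56, 0.09), 8: (0.56, 0.08),
    9: (0.34, -0.26), 10: (0.31, 0.00), 11: (0.17, -0.59), 12: (0.19, 0.06),
    13: (0.20, 0.20), 14: (0.18, -0.24), 15: (0.05, 0.24),
}


def centroid_angle(omitted=()):
    """Direction of the summed test vectors, in degrees from axis one."""
    kept = [v for test, v in neurotic.items() if test not in omitted]
    sum_first = sum(v[0] for v in kept)
    sum_second = sum(v[1] for v in kept)
    return math.degrees(math.atan2(sum_second, sum_first))


angle_all = centroid_angle()
angle_a = centroid_angle(omitted=(11,))
angle_b = centroid_angle(omitted=(11, 6))
angle_c = centroid_angle(omitted=(11, 6, 9, 14))

for label, angle in [("all", angle_all), ("A", angle_a),
                     ("B", angle_b), ("C", angle_c)]:
    print(label, round(angle, 1))
```

With all fifteen tests the direction lies almost exactly along the first axis; each omission of a test with a negative second-factor saturation turns it further, in line with positions A, B, and C described above.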

One way out of this difficulty has been suggested by Thurstone 
(1935), and has given worthwhile results in the field of mental 
abilities. He lays down certain mathematical rules which, in a 
suitable matrix, will give a rotation which is at once unique and 
invariant. This final position he calls ‘simple structure’, and he 
defines it by saying that in simple structure as many of the factor 
axes as possible should be at right angles to as many of the test 
vectors as possible. As we have seen, ‘being at right angles to’ 
means simply ‘having zero correlations with’, and this demand 
therefore means that as many of the tests should have zero 
saturations for as many of the factors as possible. More 
particularly, he requires that every factor axis should be at right 
angles to at least as many test vectors as there are factors, and 
that every test should 
be at right angles to at least one factor. In practice, these demands 
amount to this: (1) Each test should have at least one zero satura- 
tion. (2) Each factor should have at least as many zero saturations 
as there are factors. (3) There should be at least as many XO or 
OX entries in each pair of factors as there are factors. (An XO or 
OX entry means simply a zero saturation in one factor accom- 
panied by a non-zero saturation in another.) 
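
The three practical demands can be written as a simple check on a table of saturations. In the sketch below the tolerance of .10 for treating a saturation as zero, the function name, and both example tables are assumptions made for illustration; the text does not specify a tolerance.

```python
def simple_structure_report(saturations, tol=0.10):
    """Return (rule1, rule2, rule3) for a tests-by-factors table."""
    n_factors = len(saturations[0])

    def zero(x):
        return abs(x) <= tol

    # (1) Each test has at least one zero saturation.
    rule1 = all(any(zero(x) for x in row) for row in saturations)

    # (2) Each factor has at least as many zeros as there are factors.
    rule2 = all(
        sum(1 for row in saturations if zero(row[f])) >= n_factors
        for f in range(n_factors)
    )

    # (3) Each pair of factors has at least as many XO or OX entries
    #     (zero on one factor, non-zero on the other) as there are factors.
    rule3 = all(
        sum(1 for row in saturations if zero(row[f]) != zero(row[g]))
        >= n_factors
        for f in range(n_factors)
        for g in range(f + 1, n_factors)
    )
    return rule1, rule2, rule3


# Hypothetical rotated saturations that satisfy the demands.
rotated = [[0.70, 0.00], [0.60, 0.05], [0.00, 0.80], [0.10, 0.50]]

# Unrotated saturations of the kind shown in Table IX (tests 1, 3, 4, 11).
unrotated = [[0.56, 0.16], [0.68, 0.05], [0.56, 0.50], [0.17, -0.59]]

print(simple_structure_report(rotated))
print(simple_structure_report(unrotated))
```

The unrotated table fails because the first factor, being an average of all the tests, has hardly any zero saturations.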

Experience has shown that these requirements are met by 
matrices of observed correlations too rarely to be of much 
significance as long as the orthogonal pattern of factor axes is 
retained. Thurstone therefore permits his factors to be correlated 
with each other, so that his factor axes are no longer at right 
angles to each other. Figure 6 shows the kind of pattern which is 
frequently found in a Thurstone-type analysis. I and II are the 
centroid factors as determined from the original matrix; I′ and II′ 
are the rotated factors showing ‘simple structure’ and a correlation 
given by the cosine of the angle α. Thus once the factors in a 
matrix have 
been found and rotated, we can write another matrix of inter- 
correlations giving the correlations between factors. This in turn 
can be analysed, and in the case of tests of ability, seems to obey 
the tetrad criterion, thus giving rise to one second-order or super- 
factor which corresponds very closely to Spearman’s ‘g’. We thus 
seem to be able to recover the impressive simplicity and orderli- 
ness of Spearman’s original picture, without leaving out of 
account the additional ‘group factors’ which made his simple 
model inapplicable. 

The advantage of this method of analysis which gives us, first 
of all, unique and invariant primary factors, and then a ‘g’ based 
on the intercorrelations of these primary factors, is obvious. Our 
‘g’ will not change with addition or omission of new tests, but will 
remain uniquely and invariantly defined. This would not be so if 
we were to extract ‘g’ from our original battery of tests, without 
first fitting them into a ‘simple structure’ pattern. Such at least 
would be the claim of those who follow Thurstone in his method. 
To a considerable degree these claims appear to be justified. 


Figure 6 


Experimental results do appear to result in patterns of 
intercorrelations which allow the research worker to use the methods 
of ‘simple structure’, and factors thus discovered do seem to 
correlate in a hierarchical pattern which obeys the tetrad criterion. 
It may be objected that angles of rotation are not really uniquely 
determined, in view of the difficulties caused by sampling errors, 
and that the invariance of the resulting factors has not been 
demonstrated experimentally under stringent conditions of univariate 
and multivariate selection. These objections do not attack the 
principle which Thurstone is attempting to establish, and there is 
good prospect that in time the mathematical problems may be 
solved, and the experimental evidence be supplied. At the very 
least, we would appear to have here a method which holds out a 
hope that through its use we may be able to solve the taxonomic 
difficulties in the cognitive field. 

Let us look at the general picture given of the organization of 
mental abilities by factor analysis. We discern four main levels. At 
the lowest level we are dealing with single observations of be- 
haviour which may be regarded as falling into the general domain 
with which we are concerned. Many thousands of such observa- 
tions are made by all of us every day, with varying degrees of 
objectivity, reliability, and validity. We attain a much higher level 
when we can measure the consistency of this behaviour, through 
reliability coefficients or other statistical means. This will usually 
be possible only with test or examination behaviour, although 
teachers' or other persons' ratings, social achievement, or repeated 
success at some intellectual task might be considered here. At this 
level we would be dealing with mental tests of known reliability, 
dealing with a very restricted universe of content—a test of speed 
in simple addition might be quoted as an example, or a test of 
ability in completing letter series of a given level of difficulty. A 
third level is reached when we find that these tests in turn correlate 
to produce primary mental abilities, such as the verbal, numerical, 
memory, or visuospatial factors. The fourth and highest level of all 
is reached when these primary mental abilities in turn are found 
to correlate and give rise to a second-order factor corresponding to 
‘g’, or general mental ability or intelligence. A diagrammatic 
representation of these levels is given in Figure 7. Factor analysis 
enables us to give fairly exact estimates of the contribution of 
each of these levels to the total variance of a given test, and quite 
generally to give a reasonably accurate picture of the interrelations 
of all the thousands of separate facts represented even in such a 
simple diagram as the one presented here. 

Figure 7 [diagram; legible labels: observed occurrences, 
visuo-spatial, primary mental abilities, second-order factor] 

What we are doing in factor analysis is simply the breaking up 
of the total variance (the total variability on a test shown by the 
sample of people tested; the variance is equal to the square of the 
standard deviation) into various component parts. Thus we may 
let the total variance equal 100 per cent, or unity in order to keep 
to the order of size of correlation coefficients, and write an 
equation to account for all of the variance of test A: 


$$\sigma_A^2 = a_1^2 + a_2^2 + a_3^2 + \dots + a_n^2 + a_s^2 + a_e^2 = 1$$


In other words, the total variance of test A ($\sigma_A^2$) is 
accounted for by the squares of the saturations of this test on 
factors 1, 2, 3, to n, plus the square of the specific factor 
saturation ($a_s^2$) plus the error variance ($a_e^2$). The various 
parts of this equation have separate names. The variance contributed 
by the factors from 1 to n is known as the ‘communality’, because 
they constitute what the test has in common with other tests; it is 
usually written $h^2$. Specific and error variance together are 
known as the ‘uniqueness’ of the test, because these factors are not 
shared with any other test. The uniqueness ($u^2$) is of course 
equal to $1 - h^2$, because $u^2$ and $h^2$ add up to 1. The 
reliability ($r_{tt}$) of a test (its empirical test reliability) 
is given by the formula $r_{tt} = 1 - a_e^2$, from which we can 
deduce that if the reliability is known, the error variance can be 
calculated by the formula $a_e^2 = 1 - r_{tt}$. These various 
relations allow us to calculate the contribution of the various 
factors (communality, specificity, error) to the total variance of 
any test in the battery. These various relations hold for orthogonal 
factors; with certain changes they can be altered to fit the oblique 
pattern also. 
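
These relations amount to a few lines of arithmetic. The sketch below applies them to test 1 of Table IX (normal-group saturations of .69 and .15 on two orthogonal factors); the reliability of .80 is a hypothetical figure chosen for illustration, since no reliabilities are quoted in the text.

```python
def variance_components(saturations, reliability):
    """Split unit variance into communality, specificity, and error."""
    communality = sum(a * a for a in saturations)   # h^2
    uniqueness = 1.0 - communality                  # u^2 = 1 - h^2
    error = 1.0 - reliability                       # a_e^2 = 1 - r_tt
    specificity = uniqueness - error                # a_s^2
    return {
        "communality": communality,
        "uniqueness": uniqueness,
        "specificity": specificity,
        "error": error,
    }


# Test 1, normal group; the reliability value is assumed, not reported.
parts = variance_components([0.69, 0.15], reliability=0.80)

for name, value in parts.items():
    print(name, round(value, 4))
```

Communality, specificity, and error together make up the whole of the unit variance, so a negative specificity would signal that the assumed reliability is too low for the communality found.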

Can this method of multiple factor analysis with rotation to 
‘simple structure’ be taken over into the field of temperament, 
character, attitudes, interests, and so forth? There are a number 
of reasons why such a transfer would be of doubtful value. In the 
first place, the factor analysts who worked in the field of mental 
abilities had concepts and tests available which had already under- 
gone a long period of development. The faculty psychologists on 
the one hand, and the early test designers on the other, had out- 
lined the general field with considerable ingenuity. Even un- 
tutored common sense might be expected to give clues to the main 
areas in which primary mental abilities could be found: in the 
field of test material (verbal, numerical, perceptual); in the field 
of method of approach (speed, accuracy, and so forth). The reason for 
this relatively happy state of affairs is not far to seek; through the 
ages philosophers have been concerned with cognitive phenomena, 
with epistemological problems, and the general question of the 
possibility of knowledge. Their speculations in the end required 
much factual pruning, but they did illuminate the subject from all 
sides, and created a background of sophistication against which 
more recent workers could proceed. 

Not so in the non-cognitive field. This has always been the 
step-child of philosophy, psychology, and medicine, and the 
investigator will not find ready to hand a large number of objective 
tests which he may use, or a set of reasonable hypotheses which he 
may investigate. Instead, he will find a subject still apparently 
enjoying its birth trauma without giving any signs of growing up, 
but filled instead with return-to-the-womb phantasies; a subject 
in which the most remarkable theories are bandied about with 
careless abandon, and in which not even the notion that some sort 
of proof is required before hypotheses can be accepted is widely 
known or shared. Instead of objective tests, he will find the field 
dominated by ratings (in spite of their known unreliability), 
questionnaires (in spite of their known lack of validity), and 
so-called projective techniques which multiply in a horrifying 
Malthusian fashion before his very eyes (in spite of the almost 
complete absence of proof that they measure anything whatever).¹ 
Under these circumstances he cannot begin (as could the student 
of ability) by taking a number of tests and intercorrelating them 
in an effort to bring order out of chaos; the very tests which he 
may wish to use are not yet in existence, and indeed, it is not at all 
clear which mental functions he should attempt to measure even if 
he were ready to construct new tests. 

In the second place, even if the intrepid investigator were to 
succeed in locating a few objective (or even not-so-objective) tests 
which in his opinion measured temperament, he would find their 
intercorrelations woefully low, and obstinately determined not to 
fall into a pattern even remotely resembling simple structure. His 
troubles would be increased by the fact that the notion of the 
‘positive manifold’ (that all saturations should be zero or positive) 
makes sense in the field of abilities, where negative correlations are 
hardly ever found, but does not apply in the field of affection and 
conation. And while the notion of the ‘positive manifold’ is not 
absolutely essential to ‘simple structure’, there is no experience to 
guide the research worker to find simple structure in the absence 
of a positive manifold. 

¹ Gresham’s law would appear to apply in the field of psychological 
tests no less than in that of economic theory! 

In the third place, there are certain practical difficulties which 
militate against wholesale testing in the field of temperament. It 
is easy to devise group tests for mental abilities, and to test many 
thousands of school children, students, and adults with them; to 
repeat these tests, or alternate versions of them, in order to obtain 
reliability coefficients; and to carry out all these procedures in the 
class-room, the factory, or any convenient large hall. Not so for the 
type of test which might be used for the measurement of tempera- 
ment. Group testing is nearly always impossible (except for the 
egregious and not very useful questionnaire); special apparatus is 
required, such as the P.G.R., the Luria, or the C.F.C. machine; 
each test takes a great deal of time—sometimes up to two or three 
hours—and can only be carried out in the laboratory, or the 
hospital, with consequent loss of time to the subject who has to 
journey thither on many separate occasions. Add to all this the 
fact that intelligence testing is ‘respectable’, and encounters little 
opposition nowadays, while tests of temperament are almost 
by definition emotion-rousing, and frequently frightening or un- 
pleasant, as well as relating to thought content not considered 
‘nice’ by many people, and it will be appreciated why methods 
which can be used in the cognitive field may not be easily trans- 
ferred to the non-cognitive field. 

Do we have an alternative to the method suggested by Thur- 
stone in our attempt to find a set of factors which corresponds to a 
set of psychologically real influences? Cattell (1946) has pointed 
out that there are two ways of arriving at such an objective. The 
investigator, he writes, ‘may (1) devise possible ways of over- 
determining the analysis of the given correlation matrix so that 
only one set of true factors will emerge, or (2) start from the 
opposite shore and propound, on psychological grounds alone, a 
hypothesis about what source traits are operative in the variables. 
Then he will see if these factors correspond to any of the possible 
mathematical factors found in the matrix.’ The first of these
methods is that adopted by Thurstone. The second of these
methods, which is clearly the hypothetico-deductive method which
has shown itself to be so fruitful in science generally, is rejected by
Cattell on two grounds. ‘. . . in the first place, personality study
has so few other reliable avenues for arriving at, or even suspecting, 
the basic source traits, that hypotheses are likely to be erratic. In
the second place, the mathematical solutions to any set of correla- 
tions are so numerous and varied that unless the hypothesis can be 
stated in very precise quantitative terms the “proof” of it is easy— 
so easy as to be worthless.’ Cattell repeats his position in slightly
different terms: ‘. . . we have rejected one of the two major
approaches normally approved by scientific method—namely, that 
of inventing a hypothesis about the particular factors expected and 
attempting to discover a factorization to match it—because in this 
field almost any hypothesis could be so “confirmed”. Instead, we 
seek general guiding principles for the mathematical analysis itself 
which will lead to a unique solution.’ 

While I appreciate the great contribution which Thurstone has
made to the analysis of mental abilities, and while I am fully conscious
of the difficulties Cattell has pointed out in arriving at a
hypothetico-deductive method of analysis which shall be capable
of being refuted (i.e. one where the ‘proof’ would not be so easy
as to be worthless), I nevertheless believe that progress in the
non-cognitive field depends on the discovery of such a method. Accordingly
we turn next to an attempt to define the principles underlying
such a combination of factor analysis and hypothetico-deductive
method; the general name given to this procedure is that of
‘Criterion Analysis’ (Eysenck, 1950). The rest of this chapter will
be devoted to an exposition of this method; the rest of this book
to an application of this method to a variety of theoretical and
practical problems.

For a clue to the procedure which might be adopted here, we
may return to an example given earlier in this chapter. It will
be remembered that several (questionnaire) tests of ‘neuroticism’
were given to groups of normal and neurotic soldiers, the intercorrelations
for the normals and the neurotics separately calculated,
and factor analyses on the two matrices carried out. As shown in
Table IX, two factors emerged from each of the matrices which
were very provisionally identified as ‘neuroticism’ and as
‘extraversion-introversion’. Let us concentrate on the first factor only
for the time being; the arguments which apply to it will be shown
to apply to subsequent factors also. Let us deal only with the factor
emerging from the intercorrelations matrix of the neurotic group
as plotted in diagram form on page 61; this is so similar to the
normal group factor pattern that what is said of the one is also
applicable to the other.

Right from the beginning we seem to be involved in a contradiction.
We have ‘identified’ the first factor as extracted with
‘neuroticism’; yet we have used the diagram on which we have
plotted this factor (Figure 5) to show that the factor axis has no
fixed position, but changes with the addition or subtraction of
tests in the matrix. If the factor found is meaningless (as is implied
in the notion that it fails to be invariant), how can it be interpreted 
at all? The answer lies in the fact, emphasized before, that a factor 
is an average; an average may be meaningful or meaningless 
according to what it is that is being averaged. If we throw together 
tests of intelligence, neuroticism, and introversion, the resulting 
average intercorrelation will be a meaningless jumble, impossible
to interpret. If we throw together a lot of tests of neuroticism, the 
resulting average will have at least a rough-and-ready resemblance 
to meaningfulness, because what is being averaged is all measuring 
much the same underlying dimension. While therefore the axis is 
not permanently fixed, and is certainly not invariant, it will point 
roughly in the right direction—with an error of some 10 or 20 
degrees either way. Consequently, the interpretation of such a 
factor, unrotated as it is, is possible, although one’s confidence in 
this interpretation is purely subjective.


TABLE XI
List of Tests

A. Maudsley Medical Inventory: 40-item neuroticism questionnaire.
   Score = number of questions answered ‘No’ (non-neurotic).
B. Dark Adaptation: U.S. Navy Radium Plaque Adaptometer. Score
   = goodness of dark vision.
C. Non-Suggestibility: body-sway test. Ability to resist suggestion to sway
   forward.
D. Motor Control: absence of static ataxia; given as preliminary test to C.
E. Goal Discrepancy Score: smallness of level of aspiration scores on
   O’Connor tweezers test.
F. Judgment Discrepancy Scores: smallness of judgment discrepancies on
   O’Connor tweezers test.
G. Index of Flexibility: number of shifts in aspiration scores on O’Connor
   tweezers test, irrespective of size or direction.
H. Manual Dexterity: best score of nine trials on tweezers test.
I. Personal Tempo: speed of writing 2, 3, 4, repeatedly for two trials of
   15 seconds each.
J. Fluency: number of round things and of things to eat mentioned during
   30-second periods.
K. Speed Test (1): speed of tracing when instructed to be both quick and
   accurate. (Choice conditions.)
L. Speed Test (2): speed of tracing prescribed path on track tracer under
   instruction to be quick.
M. Persistence Test I: length of time during which leg is held in
   uncomfortable and fatiguing position.
N. Persistence Test B: holding breath as long as possible, without inhaling
   or exhaling.
O. Stress Test: ability of S to recover previous scoring rate on pursuitmeter
   type of test after special stress period.
P. Non-Perseveration: extremes of perseveration (SZ test), either very high or
   very low, are scored low, while scores nearer the average are scored high.

However, it will be remembered that we did go a little further
than this and provided a criterion to indicate that this interpretation
was at least partly justified. We calculated the ‘criterion
column’, i.e. the column of correlations of each test with the
normal-neurotic dichotomy, and showed that our factor saturations
correlated to the extent of .81 with this ‘criterion column’.
This is objective evidence in favour of our hypothesis, and represents
a genuine application of the hypothetico-deductive method,
because we had shown previously that on the basis of our main
hypothesis regarding the existence of a neuroticism continuum, and
on the basis of subsidiary hypotheses regarding the differential
success of the various tests used in measuring this continuum, such
a correlation could be predicted. This still leaves our factor rather
‘wobbly’ and far from invariant, but it does suggest a further step
which will tie it down completely to a position which is both
unique and invariant. That position is one in which the correlation
between the column of factor saturations and the criterion column
is a maximum. In other words, the factor axis is to be rotated in
such a way that factor saturations correlate as highly as possible
with the criterion column.
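The ‘criterion column’ itself may be illustrated by a short computation. The sketch below is an illustration only: it assumes that a plain product-moment correlation between each test score and a 0/1 group code is an acceptable stand-in for whatever coefficient was used in the original work, and that normals are coded 1 and neurotics 0. The function name `criterion_column` and the scores are hypothetical.

```python
import numpy as np


def criterion_column(scores, is_normal):
    """Correlate each test (one column of `scores`) with the
    normal-neurotic dichotomy (1 = normal, 0 = neurotic; an assumed coding).
    """
    scores = np.asarray(scores, dtype=float)
    group = np.asarray(is_normal, dtype=float)

    # Deviations from the mean, for every test and for the group code.
    score_dev = scores - scores.mean(axis=0)
    group_dev = group - group.mean()

    # Product-moment correlation of each test with the dichotomy.
    cross = (score_dev * group_dev[:, None]).sum(axis=0)
    norm = np.sqrt((score_dev ** 2).sum(axis=0) * (group_dev ** 2).sum())
    return cross / norm


# Hypothetical scores of four subjects on two tests.
scores = [[1, 5], [2, 4], [3, 3], [4, 2]]
is_normal = [0, 0, 1, 1]
print(criterion_column(scores, is_normal))
```

A test scored so that high values indicate stability receives a positive entry in the column; a test scored in the opposite direction receives a negative one.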

Let us give an example of this procedure by reference to an
experiment using objective tests rather than questionnaires. The
tests enumerated in Table XI were given to 93 normals and 105
neurotic subjects, and the correlations calculated between each
test and the normal-neurotic dichotomy. These correlations are
given in Table XII under the heading C_N (Criterion Column,
Neuroticism). Product moment intercorrelations were worked out
between the tests for 64 of the normal subjects who had carried out
all of the tests, and the matrix of intercorrelations factor analysed.
Two factors were extracted, and are given in Table XII under the
headings F_1 and F_2; the heading h² indicates in each case what
portion of the total variance of each test is accounted for by these
two factors. Rotation of the two factor axes through an angle of
five degrees makes the correlation between first factor saturations
and criterion column a maximum (r = .574), and gives us the
values for the rotated first factor given in column D. A graphic
demonstration of the results is given in Figure 8; axes I and II are
the centroid axes as found; D is the rotated first factor axis. To
show what would happen if an attempt were made to approach
simple structure, axes I′ and II′ have been drawn in; they will be
seen to be almost completely uninterpretable.

TABLE XII

Test   C_N           F_1     F_2     h²      D       D (unity)
A      .23           .143    .211    .065    .080    .127
B      .27           .392   −.220    .202    .256    .407
C      .51           .620   −.416    .557    .409    .650
D      .54           .644   −.428    .607    .435    .675
E      [illegible]   .100    .089    .018    .059    .094
F      .10           .497   −.455    .454    .333    .529
G      .05           .275   −.497    .233    .191    .303
H      .57           .405   −.078    .170    .258    .410
I      .30           .438    .175    .222    .267    .424
J      .03           .300    .018    .090    .188    .299
K      .17           .523    .100    .284    .324    .515
L      .17           .565    .461    .532    .333    .529
M      .46           .607    .384    .516    .363    .576
N      .26           .632    .430    .584    .377    .599
O      .24           .294   −.103    .097    .189    .300
P      .21           .207    .241    .101    .119    .189

                     .203    .093    .296

Explanation of Column Headings:

C_N = Criterion Column, i.e. correlation of each test with normal-neurotic
dichotomy.
F_1 and F_2 = First and second unrotated factors from analysis of
intercorrelations of normal group only.
h² = Communality.
D = F_1 rotated into maximum correlation with criterion column.
D (unity) = D with vector extended to unity.

r (C_N, D) = .574
r (subscripts illegible) = .587

Several points should be noted in this example. The angle
required to get maximum correlation between F_1 and C_N is very
small; as all the tests were chosen as measures of neuroticism, their
average (the first centroid factor) is therefore meaningful and close
to the optimum position. This optimum position is unique and
invariant; the addition of new tests, or the subtraction of old ones,
would not alter the position of D. The relatively high, positive
correlation between C_N and F_1 is in line with our hypothesis that
the factor is one of neuroticism and supports the view that
neuroticism is a continuum. The grouping of the tests on the second
factor is in conformity with what in previous investigations had
been shown to be characteristic of the introvert-extravert dichotomy.
Thus introverts have been shown to be more persistent, and
extraverts to show less judgment discrepancy, and somewhat less 
suggestibility, as well as better dark-adaptation. None of these 
points will be stressed here as the study mentioned was only a 
preliminary one, using too few subjects to be convincing, and 
employing a selection of tests which was much improved in later 
work. It is given here only for the purpose of illustrating the 
method under discussion. 
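The rotation into maximum correlation with the criterion column may be sketched numerically as follows. This is an illustration under stated assumptions, not the computing routine of the original study: it takes two unrotated factors only, uses hypothetical saturations and criterion values, and relies on the fact that the weighted sum of two columns which correlates most highly with a third is the one given by least-squares regression. The function name `criterion_rotation` is invented for the example, and the sign convention of the angle is likewise an assumption.

```python
import numpy as np


def criterion_rotation(f1, f2, criterion):
    """Rotate the first of two factor axes so that its saturations
    correlate as highly as possible with a criterion column.

    Returns the angle (radians), the rotated saturations, and the
    correlation reached.
    """
    f1 = np.asarray(f1, dtype=float)
    f2 = np.asarray(f2, dtype=float)
    c = np.asarray(criterion, dtype=float)

    # Regress the criterion column on the two factor columns (with an
    # intercept). The two slopes give the direction of maximum
    # correlation in the plane of the factors.
    design = np.column_stack([np.ones_like(f1), f1, f2])
    weights, *_ = np.linalg.lstsq(design, c, rcond=None)
    theta = np.arctan2(weights[2], weights[1])

    rotated = np.cos(theta) * f1 + np.sin(theta) * f2
    r = np.corrcoef(rotated, c)[0, 1]
    return theta, rotated, r


# Hypothetical saturations for five tests (not taken from Table XII).
f1 = [0.60, 0.50, 0.40, 0.30, 0.20]
f2 = [-0.40, 0.30, -0.10, 0.20, 0.10]
criterion = [0.50, 0.30, 0.35, 0.10, 0.15]

theta, rotated, r = criterion_rotation(f1, f2, criterion)
print(f"angle = {np.degrees(theta):.1f} degrees, r = {r:.3f}")
```

Because a correlation is unaffected by the length of the vector, only the direction of the rotated axis matters here; no other angle in the plane of the two factors can give a higher value of r.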

At this point the critical reader may feel that we have jumped 
from the frying-pan into the fire. We have managed to get rid of 
the subjective element involved in locating our factor axes, only to 
link up this rotation with the even more subjective criterion of 
psychiatric diagnosis. If our final aim were merely to obtain 
maximum correlation with the criterion, i.e. with the diagnosis 
‘normal’ or ‘neurotic’, then some form of multiple correlation or 
discriminant function analysis would be indicated. Clearly, a 
further step is required. 

Let us look more closely at our criterion. ‘Neurotics’ are people 
who have been thus diagnosed by a qualified psychiatrist, after 
full consideration of their past work history, illness record, sexual 
behaviour, phantasy life, and social adaptation, as well as their 
current symptomatology. ‘Normals’ are the undifferentiated mass 
of people who have not at any time of their lives been so diagnosed, 
and who at the time of the investigation are leading relatively 
normal lives and are not experiencing symptoms severe enough to 
lead them to seek help from a psychiatrist. The ‘neurotic’ group 
itself is of course not entirely homogeneous, but subdivided in 
various ways. Some are bright, others are dull; some are extra- 
verted (hysterics), others are introverted (dysthymics); some show 
additional psychotic symptoms, others do not. There is much 
agreement among experts that certain groups fall into the general 
field of ‘neurosis’, e.g. hysterics, anxiety states, reactive depres- 
sions; there is no agreement at all regarding other groups, such as 
the psychopathic states, or the obsessional and compulsive groups. 
Some people would argue that ‘psychotics’ cannot be differentiated
from ‘neurotics’, and should therefore form part of the criterion 
group; others would claim that psychotics and neurotics are 
entirely different disease forms, in no way resembling each other. 
Clearly, our criterion is in a very doubtful state, and itself in need 
of clarification. This can be supplied by an inverse application of
our principle of maximizing the correlation between factor (F_1)
and criterion (C_N), which leads us to a double maximization principle.

It is well known that the correlation between a test and a
criterion can be increased in two ways: by improving the test, and
by improving the criterion. Similarly, we can increase the correlation
between factor and criterion column in two ways: by improving
our factor measurements, and by improving our criterion. So
far, we have considered examples in which the criterion group
consisted of neurotics suffering from those disorders and symptoms
most commonly agreed to define neurosis: hysterical conversion
symptoms; somatic correlates of anxiety; feelings of depression,
worry, and fears unreasonably severe in relation to their causes;
amnesias and other disorders of memory not caused by any obvious
physical agent or injury. This criterion may be only a relatively
rough approach to a correct and perfect criterion, and this lack of
perfection in the criterion may be reflected in the failure of the
correlation between F_1 and C_N to reach perfection. If that be so,
then clearly any improvement in the criterion would result in an
increase in the correlation between F_1 (always rotated into a
position of maximum correlation with C_N) and C_N, determined
now by the new criterion chosen.

As an example, let us suppose that the large group of psychopaths
belongs functionally in the neurotic field, as is maintained
by many psychiatrists. Then if we changed our group of neurotics,
as used in the experiment summarized in Table XI, to include
a large percentage of psychopaths, we would presumably obtain
different scores on all our tests for this new, enlarged ‘neurotic’
group, and consequently a changed column of correlations between
tests and normal-neurotic dichotomy (C′_N). If the hypothesis that
psychopaths did belong to the neurotic group were correct, we
would expect the correlation between F′_1 (rotated into maximum
correlation with the new C′_N) and the new C′_N to be as large as or
larger than the correlation of F_1 had been with the old C_N. A drop
in this correlation would indicate very clearly that our criterion
had not improved, but had become attenuated by the inclusion of
this ‘foreign body’, and that consequently ‘psychopathy’ did not
form a part of the neurotic criterion. The same procedure could of
course be gone through with other groups (obsessionals, epileptics,
and so forth) in an attempt to obtain the maximum possible
correlation between criterion and factor. Given that the proper
criterion could be approached in this manner, a correlation of
near unity should be capable of achievement; as will be shown
later, correlations of [illegible] and .95 have already been reached between
factor saturations and criterion columns.
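The comparison of an old and a revised criterion may be put in the form of a small sketch. It assumes, for simplicity, that the two factor columns stay the same while only the criterion column changes, and that the maximum attainable correlation is the multiple correlation of the criterion column on the two factor columns. All figures and the function names `max_criterion_correlation` and `compare_criteria` are hypothetical.

```python
import numpy as np


def max_criterion_correlation(f1, f2, criterion):
    """Largest correlation any rotation of the first axis (in the plane
    of f1 and f2) can reach with the criterion column."""
    c = np.asarray(criterion, dtype=float)
    design = np.column_stack([np.ones(len(c)), f1, f2])
    weights, *_ = np.linalg.lstsq(design, c, rcond=None)
    fitted = design @ weights
    return float(np.corrcoef(fitted, c)[0, 1])


def compare_criteria(f1, f2, old_criterion, new_criterion):
    """Return both maxima and whether the revised criterion did at
    least as well as the old one."""
    r_old = max_criterion_correlation(f1, f2, old_criterion)
    r_new = max_criterion_correlation(f1, f2, new_criterion)
    return r_old, r_new, r_new >= r_old


# Hypothetical figures: a revised criterion that fits the factor better.
f1 = np.array([0.60, 0.50, 0.40, 0.30, 0.20])
f2 = np.array([-0.40, 0.30, -0.10, 0.20, 0.10])
old = np.array([0.50, 0.30, 0.35, 0.10, 0.15])
new = 0.8 * f1 - 0.1 * f2

print(compare_criteria(f1, f2, old, new))
```

A fall in the second figure relative to the first would correspond to the case described above, in which the added group has attenuated rather than improved the criterion.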

The essential points in ‘criterion analysis’ will now have become 
clear. We start with a given criterion for some hypothetical con- 
tinuum which we suspect, on purely psychological grounds, to be 
of general interest and importance. This criterion will in all probability
be very impure, attenuated, and inaccurate. We create
measuring instruments which can be shown to give some rough 
measure of correspondence with the criterion; we then study the 
interrelations of these measuring instruments by means of factor 
analysis, to obtain evidence of the existence of our hypothetical 
continuum, and to discover whether it is related in some rough 
way to our criterion. As has been shown early in this chapter, we 
may in this way actually transcend our criterion (the example of 
the teachers' ratings of intelligence as opposed to the intelligence 
test will be remembered). We may therefore now use our factor to 
improve our criterion, until we reach a position of exact equival- 
ence. When this is done, we may assume that our original hypo- 
thesis was justified. This process may appear like arguing in a 
circle, but in reality this method of argument corresponds precisely 
to the method used by physical science to define even such 
elementary concepts as ‘length’. We may with advantage quote 
Bertrand Russell’s (1948) discussion of physical measurement at 
this point. 

‘Measurement, even of the distance to remote nebulae, is built
up from measurements of distances on the surface of the earth, and 
terrestrial measurements start with the assumption that certain 
bodies may be regarded as approximately rigid. If you measure 
your room, you assume that your foot-rule is not growing appreci- 
ably longer or shorter during the process. The ordnance survey of 
England determines most distances by triangulation, but this pro- 
cess demands that there shall be at least one distance which is 
measured directly. In fact, a base line on Salisbury Plain was 
chosen, and was measured carefully in the elementary way in 
which we measure the size of our room: a chain, which we may 
take as by definition of unit length, was repeatedly applied to the 
surface of the earth along a line as nearly straight as possible. This 
one length having been determined directly, the rest proceeds by 
the measurement of angles and by calculation; the diameter of the 
earth, the distance of the sun and moon, and even the distances of
the nearer fixed stars, can be determined without any further
measurement of lengths. 

‘But when this process is scrutinized it is found to be full of 
difficulties. The assumption that a body is “rigid” has no clear 
meaning until we have already established a metric enabling us to 
compare lengths and angles at one time with lengths and angles at 
another, for a “rigid” body is one which does not alter its shape or 
size. Then again we need a definition of a “straight line”, for all 
our results will be wrong if the base line on Salisbury Plain and the
lines used in triangulation are not straight. It seems, therefore,
that measurement presupposes geometry (to enable us to define
“straight lines”) and enough physics to give grounds for regarding
some bodies as approximately rigid, and for comparing distances
at one time with distances at another. The difficulties involved are
formidable, but are concealed by assumptions taken over from
common sense. . . . Common sense assumes, roughly speaking,
that a body is rigid if it looks rigid. Eels do not look rigid, but steel
bars do. . . . Common sense, in so thinking, is Newtonian: it is
convinced that at each moment a body intrinsically has a certain
shape and size, which either are or are not the same as its shape
and size at another moment. Given absolute space, this conviction
has meaning, but without absolute space it is prima facie
meaningless. . . .

‘As in the case of the measurement of time, three factors enter
in: first, an assumption liable to correction; second, physical laws
which, on this assumption, are found to be approximately true;
third, a modification of the assumption to make the physical laws
more nearly exact. If you assume that a certain steel rod, which
looks and feels rigid, preserves its length unchanged, you will find
that the distance from London to Edinburgh, the diameter of the
earth, and the distance from Sirius, are all nearly constant, but are
slightly less in warm weather than in cold. It will then occur to
you that it will be simpler to say that your steel rod expands with
heat, particularly when you find that this enables you to regard the
above distances as almost exactly constant, and, further, that you
can see the mercury in the thermometer taking up more space in
warm weather. You therefore assume that apparently rigid bodies
expand with heat, and you do this in order to simplify the statement
of physical laws.

‘Let us get clear as to what is conventional and what is physical
fact in this process. It is a physical fact that if two steel rods,
neither of which feels either hot or cold, look as if they were of the
same length, and if, then, you heat one by the fire and put the 
other in snow, when you first compare them again the one that has 
been by the fire looks slightly longer than the one that has been in 
the snow, but when both have again reached the temperature of 
your room this difference will have vanished. I am here assuming 
pre-scientific methods of estimating temperature: a hot body is one 
that feels hot, and a cold body is one that feels cold. As a result of 
such rough prescientific observations we decide that the thermo- 
meter gives an exact measure of something which is measured 
approximately by our feelings of heat and cold; we can, then, as 
physicists, ignore these feelings and concentrate on the thermo- 
meter. It is then a tautology that my thermometer rises with an 
increase of temperature, but it is a substantial fact that all other 
thermometers likewise do so. This fact states a similarity between 
the behaviour of my thermometer and that of other bodies. 

‘But the element of convention is not quite as I have just stated 
it. I do not assume that my thermometer is right by definition; on 
the contrary, it is universally agreed that every actual thermometer 
is more or less inaccurate. The ideal thermometer, to which actual 
thermometers only approximate, is one which, if taken as accurate, 
makes the general law of the expansion of bodies with rising tem- 
perature as exactly true as possible. It is an empirical fact that, by 
observing certain rules in making thermometers, we can make
them approximate more and more closely to the ideal thermometer, 
and it is this fact which justifies the conception of temperature as a
quantity having, for a given body at a given time, some exact value 
which is likely to be slightly different from that shown by any 
actual thermometer. 

‘This process is the same in all physical measurements. Rough 
measurements lead to an approximate law; changes in the measur- 
ing instruments (subject to the rule that all instruments for measur- 
ing the same quantity must give as nearly as possible the same 
results) are found capable of making the law more nearly exact. 
The best instrument is held to be the one that makes the law most 
nearly exact, and it is assumed that an ideal instrument would
make the law quite exact.

‘This statement, though it may seem complicated, is still not
complicated enough. There is seldom only one law involved, and
very often the law itself is only approximate. Measurements of
different quantities are interdependent, as we have just seen in the
case of length and temperature, so that a change in the way of
measuring one quantity may alter the measure of another. Laws, 
conventions, and observations are almost inextricably intertwined 
in the actual procedure of science. The result of an observation is 
usually stated in a form which assumes certain laws and certain 
conventions; if the results contradict the network of laws and con- 
ventions hitherto assumed, there may be considerable liberty of 
choice as to which should be modified.’
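The steel rod of this quotation lends itself to a small worked example. The sketch below assumes a coefficient of linear expansion for steel of about 1.2 × 10⁻⁵ per degree Centigrade (a rough textbook figure) and a rod which is exactly one unit long at 15° C.; both figures, and the function names, are assumptions introduced for illustration and do not come from the passage quoted.

```python
ALPHA_STEEL = 1.2e-5    # per degree C; rough textbook value, assumed here
REFERENCE_TEMP = 15.0   # temperature at which the rod is 1 unit long (assumed)


def rod_length(temp_c):
    """Actual length of the rod at a given temperature."""
    return 1.0 + ALPHA_STEEL * (temp_c - REFERENCE_TEMP)


def apparent_distance(true_distance, temp_c):
    """Number of rod lengths counted when the rod is treated as rigid."""
    return true_distance / rod_length(temp_c)


def corrected_distance(true_distance, temp_c):
    """The same count, corrected for the expansion of the rod."""
    return apparent_distance(true_distance, temp_c) * rod_length(temp_c)


for temp in (0.0, 30.0):
    print(temp, apparent_distance(1000.0, temp), corrected_distance(1000.0, temp))
```

On the assumption of a rigid rod the distance comes out slightly smaller in warm weather than in cold; once the rod is allowed to expand with heat the two measurements agree, which is the simplification of the physical laws that the passage describes.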

The close similarity between this mutual interplay of rough- 
and-ready, common-sense observation, measurement based upon 
such observation, and modification to give more precise inter- 
relations of measurements, with the procedure of ‘criterion analysis’ 
will be obvious. Criterion analysis attempts to do in a formal 
manner in psychology what has always been done in the physical 
sciences. Our original hypotheses about temperature are derived
from ‘clinical’ observation of our own feelings of hot and cold;
this forms our original criterion. There is no doubt that this
criterion is inaccurate, fallible, unreliable, and impure. One
person’s judgments of ‘hot’ and ‘cold’ are not necessarily identical
with another’s, and may deviate considerably; external factors
such as humidity play an important part in our judgment, as do
internal factors such as our state of nutrition, or the amount of
exercise we have taken. But roughly these feelings of ‘hot’ and ‘cold’
can be shown to correspond with certain physical phenomena,
such as the contraction and expansion of metals, the pressure of
gases, or the emission of electrons from a heated surface. As these
physical phenomena constitute an interconnected system of observations
which in turn fits in well with other observations (rigidity
of bodies, measurement of distances) the criterion is revised to
obtain the best possible fit (the highest possible correlation) between
all these sets of phenomena.¹


1 It should not be imagined that physical measurement of even such simple
and well-understood variables as heat and length is without difficulties and
pitfalls analogous with those we experience in psychology. Thus in psychological
testing we often encounter the phenomenon of ‘threshold’ and ‘ceiling’
effects, i.e. of tests which measure effectively over a certain range, but either
fail to measure the variable in question at all outside this range, or else give
biassed or meaningless results. In physics, we encounter the same difficulty.
Thus temperature is defined in terms of the resistance of a platinum wire from
−183° C. to 660° C.; from 660° C. to 1063° C. it is defined and measured by
the electromotive force of a standard thermocouple. At even higher temperatures,
definition and measurement are in terms of the intensity of radiation of a
defined wave-length from a ‘black body’. As Scott-Blair points out, ‘particular
interest lies in the fact that these conventions result in serious ambiguity, since
it so happens that pure aluminium freezes (“becomes solid”) at just over 660° C.
using the resistance thermometer, thus falling within the official range of the
thermocouple, whereas when the thermocouple is used, the freezing temperature
is just below 660°, implying that the resistance thermometer is required!
Until a suitable adjustment was made, it was therefore impossible to quote an
accurate freezing point for aluminium’ (Scott-Blair, 1950).

In the same way, ‘clinical’ observations of certain types of
behaviour give us our original hypothesis regarding the existence
of a continuum of ‘neuroticism’; these observations, regularized
and formalized in legal and psychiatric practice, form our original
criterion. There is no doubt that this criterion is inaccurate, fallible,
unreliable, and impure. One person’s judgments of ‘neurotic’ and
‘normal’ are not necessarily identical with another’s, and may
deviate considerably; external factors such as intelligence, or
beauty, play an important part in our judgment, as do internal
factors such as the state of our liver, or our general attitudes. But
roughly these assessments of ‘neurotic’ and ‘normal’ can be shown
to correspond with certain objective tests, such as body-sway
suggestibility, or persistence. As these objective tests constitute an
interconnected system of observations which in turn fits in well
with other observations (employability of mental defectives, effects
of prefrontal leucotomy, selection of nurses and students, monozygotic
and dizygotic twin differences) the criterion is revised to
obtain the best possible fit (the maximum correlation) between all
these sets of phenomena. Empirical evidence for these statements
will be given in later chapters; here we are concerned merely with
the general theory underlying scientific methodology.

The method of criterion analysis will become clearer as its
various applications are studied as they occur throughout this
book. In essence, it is nothing but the application to the special
taxonomic problems of human behaviour in its non-cognitive
aspects of the general principles of scientific method, more particularly
of the hypothetico-deductive method. It requires a main
hypothesis regarding the existence of a quantitative continuum
underlying a given area of human behaviour; it also requires subsidiary
hypotheses regarding the nature of this continuum, which
will determine the tests to be used, and the linearity or otherwise
of the relations anticipated. . . . Deductions made from such a set
of hypotheses can be disproved (an example of this will be given
later in the book), or they can be supported with varying degrees
of evidence by the results of specially planned experiments. We
must now turn to the empirical evidence, to test the value of our
procedures by application to a variety of hypotheses.


Chapter Three 


THE NEUROTIC DIMENSION: OPERATIONAL 
DEFINITION 


In this chapter we will begin to discuss the formal proof for our
hypothesis of the existence of a general factor of neuroticism or
stability. These terms, which are intended to denote the extreme
points of what is conceived of as a continuum, will be used inter-
changeably according to the particular direction implied in any
particular case, just as one speaks of intelligence and mental
deficiency depending on which of the two extreme points of the
cognitive continuum one may be concerned with. A diagrammatic
statement of the hypothesis is given on page 52; it suggests
immediately the necessity for a formal disproof of what we may
call the null hypothesis, i.e. the hypothesis that our two extreme
groups (normals and neurotics) are not in fact objectively dis-
criminated at all, but are merely chance and random selections
from a homogeneous population. To many readers, disproof of the
null hypothesis will appear merely pedantic and completely un-
necessary, because this hypothesis goes so much counter to common
sense and current psychiatric teaching; however, as we noted in
our discussion of the evidence available with respect to the effects
of psychiatric treatment, current beliefs are not always accurate
guides to scientific knowledge, and a rigid proof must not be built
on the shifting sands of hearsay and assumption.

In citing evidence against the null hypothesis, we have grouped
the studies quoted into four main divisions: studies dealing with
psychiatric ratings, studies dealing with questionnaires and inven-
tories, studies dealing with objective behaviour tests, and studies
dealing with constitutional differences. In reviewing this evidence,
we will at the same time have an opportunity of discussing hypo-
theses regarding the traits which may be said to correlate with
the neuroticism continuum, and to characterize the neurotic,
unstable end as opposed to the normal, stable end. We will also be
able to review some of the evidence which suggests that neither
ratings nor self-ratings can be relied on to give us the reliable,
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valid, and objective type of evidence which we are looking for. 
No attempt has been made to give a complete survey of the litera- 
ture. We have attempted rather to pick out important landmarks 
relevant to our main hypothesis, and characterized by proper 
research design, adequacy of statistical treatment, and use of a
sufficiently large number of cases. The principle of selection was
that one convincing demonstration is superior to a discussion of a
hundred papers all of them deficient from one point of view or 
another; in cases of doubt, preference has been given to material 
otherwise not easily accessible. 


(1) Ratings and Psychiatric Diagnoses
The first of the studies chosen to illustrate this section was
carried out by Reyburn and Raath (1950), and involved the
rating by 83 observers of two subjects each. The total experimental
population consisted of 160 ratees, evenly balanced with respect
to sex and University education. Inter-rater reliability on a few
subjects who were rated by two observers was .806. Ratings were
on a five-point scale, covering 45 well-defined personality traits;
the actual ratings were gone over in each case with the rater by
one of the experimenters to ensure proper understanding of the
categories used.
The table of intercorrelations was subjected to several methods
of analysis; of particular interest here is the oblique solution which
resulted in six factors which were not independent of each other,
and the correlations between which clearly gave rise to higher-
order factors. These were not derived by the writers, but were
calculated from their Table V for inclusion in this book. Thirteen
iterations were required before the communalities began to con-
verge. The results are very clear-cut, and are set out in Table XIII
below. It will be seen that the first factor, which was rotated to


TABLE XIII

FACTOR SATURATIONS FOR ORIGINAL AND ROTATED SOLUTIONS

[Saturations not legible in the source. Items: Spontaneity, Stability, Persistence, Assertiveness, Sensitivity, Inferiority.]
pass through ‘stability’, has a saturation of .972 for that item; this
identifies it as the opposite end of the ‘neuroticism’ factor with 
which we are concerned. It is reasonable to find that persistence is 
closely associated with stability, and that assertiveness also has a 
positive projection on this factor. Inferiority feeling and sensitivity 
are understandably loaded negatively with stability, as is also the 
factor ‘spontaneity’. 

The second factor opposes ‘assertiveness’ and ‘spontaneity’ to 
‘inferiority’ and ‘sensitivity’; this falls in with our conception of 
introversion and extraversion respectively. We thus get a very 
interesting confirmation here for our view of ‘neuroticism’ and 
‘introversion-extraversion’ as second-order factors in the orectic 
sphere, corresponding to Thurstone’s intellectual second-order 


factor in the cognitive sphere. Figure 9 gives a diagrammatic 
representation of the results. 


[Figure 9: diagram of the factors; legible labels: Assertiveness, Spontaneity, Stability.]
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The other main research to be discussed under this heading is 
one reported in three papers by Mayer-Gross et al. (1949), Slater 
(1947), and Rao et al. (1949). A total of 201 neurotic and 55 
normal officers were rated by a psychiatrist on thirteen main 
behaviour ‘pointers’, which are defined at some length in the 
original publication. Ratings were confined to noting either the 
presence or the absence of the particular trait to be rated. A brief
statement of the traits used is given in Table XIV, together with
the proportion of cases showing any of the traits in either the 
normal or the neurotic group. All the differences observed were 


TABLE XIV

                                           Percentage Incidence
Personality ‘pointer’ rated                Neurotic    Normal      D     Factor Loadings*
 1. Heredity                                 [?]        .13        76        2.13
 2. Physical ill-health                      .27        .00        54        2.11
 3. Neurotic traits in childhood             .45        .16        82        4.62
 4. Former psychiatric illness               .37        .05       [?]        3.45
 5. Shy, solitary, etc. in childhood         .40        [?]        74        3.98
 6. Difficulty in making social contacts     .33        [?]        62        [?]
 7. Emotional instability                    .73        [?]       141        4.42
 8. Obsessional features                     .37        .13        67        1.74
 9. Apprehensiveness                         .59        .02       117        5.25
10. Dependence                               .53        .04       105        4.61
11. Unstable work record                     .23        [?]       [?]        2.81
12. Marriage or sex [?]                      [?]        [?]       [?]        [?]
13. Alcoholism, sexual difficulties          [?]        [?]       [?]        [?]

* Values obtained directly from a matrix of 2 x 2 tables on the
assumption of a general factor and two group factors.
[?] marks an entry that is not legible in the source.


shown to be significant by means of a chi squared analysis. Also
given is a column headed ‘D’, showing the absolute difference in
the occurrence of each ‘pointer’ between the normal and the
neurotic groups; this column will be referred to as the ‘difference
column’.
A table showing the frequency of concomitance of the pointers
was prepared, and factor analysed directly, i.e. without conversion
into correlations first, as would be the more usual procedure.
This analysis resulted in one prominent general factor and two
relatively unimportant subsidiary factors. The general factor
saturations correlated + .79 with the ‘difference column’, leaving
little doubt about the identification of this factor as one of
‘neuroticism’, or ‘constitutional adequacy’, as the writers prefer to
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call it. The other factors are identified as ‘episodic instability’ and 
‘shyness’ respectively. Scores for these three factors were derived 
very simply by noting the number of pointers of each class for each 
individual, thus obtaining three scores (A, B, and C) for each 
individual. These three variables can be considered as defining a 
space of three dimensions so that any particular individual can be 
represented by a point in such a space. Groups of individuals, as for 
instance the clinically-diagnosed syndromes of obsession, hysteria, 
anxiety state, or post-traumatic personality change, can then be 
represented by a cluster of points around the mean value of that 
cluster. 
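
The ‘difference column’ of Table XIV can be checked by a short calculation. The sketch below assumes that D is the absolute difference between the numbers of neurotic and normal officers showing a pointer, obtained by multiplying each incidence by its group size (201 and 55); this reading is an inference from the legible rows of the table, not a statement made in the text, and the function name is purely illustrative. Only rows whose three entries are all legible are used.

```python
# Group sizes reported for the study: 201 neurotic and 55 normal officers.
N_NEUROTIC = 201
N_NORMAL = 55

# Legible rows of Table XIV: (incidence neurotic, incidence normal, printed D).
LEGIBLE_ROWS = {
    "Physical ill-health": (0.27, 0.00, 54),
    "Neurotic traits in childhood": (0.45, 0.16, 82),
    "Obsessional features": (0.37, 0.13, 67),
    "Apprehensiveness": (0.59, 0.02, 117),
    "Dependence": (0.53, 0.04, 105),
}


def case_difference(p_neurotic: float, p_normal: float) -> float:
    """Assumed definition of D: absolute difference in numbers of cases."""
    return abs(p_neurotic * N_NEUROTIC - p_normal * N_NORMAL)


for pointer, (p_neur, p_norm, printed_d) in LEGIBLE_ROWS.items():
    estimate = case_difference(p_neur, p_norm)
    # The incidences are rounded to two places, so agreement to within
    # about one case is all that can be expected.
    print(f"{pointer:30s} estimated D = {estimate:6.1f}   printed D = {printed_d}")
```

On this assumption every legible row agrees with the printed value to within about one case, which is what rounding of the incidences to two decimal places would allow.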

As the neurotic members of the sample used by the investigators 
had been diagnosed in these various categories, it was possible to 
test the hypothesis that the mean values of the groups in question 
were collinear in the three-dimensional test space; in other words, 
that differences between these syndromes could be accounted for 
entirely by differences in severity of neurosis, without any other 
principle of differentiation. The appropriate tests of significance 
indeed show that this is the case. However, it appeared of some 
interest to calculate the first two canonical variates, and plot the 
position of the various groups in the two-dimensional space thus 
generated. The resulting picture is shown in Figure 10 and the
figures are given in Table XV. Much the greater part of the varia-
tion occurring is obviously taken up by the ‘severity of neurosis’
component, λ1, ranking the groups in order from normal to obses-
sional and psychopathic. However, the second dimension, λ2,
although not significant, suggests a grouping closely in accord with
the hypothesis outlined in Dimensions of Personality, namely, that
additional to the dimension of neuroticism, and orthogonal to it,
we have another dimension, that of extraversion-introversion,
which finds its prototype in the neurotic population in the hysteric-
psychopathic (extraverted) and the anxious-obsessional (intro-
verted) type of personality. It will be seen that the main division
suggested by the λ2 variable is precisely between the hysteric-
psychopathic group on the one hand, and the anxious-obsessional 
group on the other. The lack of significance is perhaps explicable 
in terms of the small number of cases involved—there were only 
17 cases in the obsessional group, for instance, and only 5 in the 
‘personality change’ group. Alternatively, the suggestion may be 
advanced that through condensation of the original 13 scores into 
3 only, too much information was lost to allow existing differences 
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to appear clearly. Or again, the original choice of variables, made 
in order to maximize normal-neurotic differences, may not have 
been adequate for bringing out differences within the neurotic 
group itself. That one of these explanations is likely to be true is 
indicated by our success in differentiating between introverted and 
extraverted neurotic groups by means of psychological tests. 


Figure 10.—Configuration of Diagnostic Groups

TABLE XV

MEAN VALUES OF FIRST TWO CANONICAL VARIATES

Group                   λ1       λ2
Normal                  [?]      .16
Personality Change      .79      .36
Anxiety State          2.22      .21
Hysteria               2.32      [?]
Psychopathy            3.13      [?]
Obsessional            3.36      .62

[?] marks an entry that is not legible in the source.


These ratings were made post hoc, as it were, and dealt with
subjects who had already broken down; consequently they can
hardly escape the criticism that they do not furnish us with evidence
of predictive validity. Unless a rating can be made before breakdown
which will identify the potential neurotic the value of the method
must remain in doubt. Evidence on this point is given by two
important studies by W. A. Hunt et al. (1949), using what they
call the ‘historico-experimental’ method. Their experiment was
designed to test the hypothesis that Naval neuro-psychiatric
screening had been efficacious in lowering the subsequent rate of
neuro-psychiatric attrition during service, and was based on the
premiss that ‘the more neuropsychiatrically unfit individuals who are
removed from a sample of recruits during the recruit training
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period, the smaller should be the number of medical surveys for 
neuropsychiatric reasons among that sample during their later 
military service . . . if neuropsychiatric screening performed its 
assumed function, an inverse relation should exist between the 
recruit screening rate at the Naval Training Station level, and the 
subsequent rate of neuropsychiatric attrition during service’. 
Three training stations were selected in such a way that recruit 
populations and professional psychiatric competence of interview- 
ing staff were roughly comparable; the major difference between 
the stations lay in the differential discharge rate for neuropsy- 
chiatric reasons, based on different attitudes towards psychiatry 
held by the Commanding Officers of these stations. The results of 
this study are clearly shown in Table XVI; it will be seen that in 
the stations which tended to act upon the psychiatrist’s recommendation,

TABLE XVI

                              Percentage    Percentage       Percentage Discharged
                              Discharged    Discharged       Subsequently by Years
Training Station       N      in Training   Subsequently     1943     1944     1945
Great Lakes          1525        4.5           [?]           [?]      [?]      [?]
Newport              1173        2.6           1.8           [?]      [?]      [?]
Sampson              2823        [?]           3.0            .6      1.0      1.5
Total:               5521        2.9           2.4           [?]      [?]      [?]

[?] marks an entry that is not legible in the source.

subsequent attrition rate for neuropsychiatric reasons
was only about half of that shown by the station which tended to 
reject psychiatric advice. It will also be seen from the figures that 
there appears some evidence of a law of diminishing returns in 
neuropsychiatric screening; the subsequent attrition rates of ‘Great 
Lakes’ and ‘Newport’ stations are very similar, although in one of 
them only half as many recruits were discharged during training 
as in the other, due to the fact that the Commanding Officer 
attempted to set some upper limit to the number of screening dis- 
charges. These results were confirmed in another paper by the 
same authors, using further large groups of recruits, and since the 
sample involved in these studies is now over 17,000 cases, and since 
the controls are probably as satisfactory as can be hoped for with 
such large scale research carried out in wartime conditions, ‘the
authors feel that the use of neuropsychiatric screening by the Navy 
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in World War II has been validated, and that the rational argu- 
ment for neuropsychiatric screening has been confirmed by experi- 
mental findings’. With this statement the present writer is in full 
agreement; the importance of these studies in proving the possi- 
bility of identifying the potential neurotic before he is exposed to 
the stress that leads to later breakdown can hardly be exaggerated. 

However, while it can thus be shown that psychiatric rating 
procedures have a certain amount of validity, complacency is 
hardly in order when we look at certain results published by the 
Information and Education Division in the U.S. War Department 
(Stouffer, 1949). These results have a bearing on the question of 
reliability rather than on the question of validity, except that
validity, in the long run, cannot outstrip reliability; as far as they
go, however, they show a picture of unreliability which must cause
one to pause before accepting ratings as conclusive evidence in
relation to the type of problem we are discussing here. A brief
discussion of some of the data on which this judgment is based will
now be given.

During August, 1945, 42.3 per cent of all literate preinduction
examinees were rejected, 14 per cent for psychiatric reasons: 5.6
per cent were rejected with the diagnosis of ‘psychoneurotic’, 3.9
per cent with the diagnosis of ‘psychopathic’, .3 per cent with the
diagnosis of ‘psychotic’, and 4.2 per cent for other psychiatric
reasons. When rejections are analysed by induction station (there
were fifty-five of these) the percentage of psychiatric rejections is
found to be very unstable, ranging from .3 and 3.8 per cent at the
one end to 50.6 per cent at the other! In other words, for every one
inductee rejected for psychiatric reasons in one centre, there are
over one hundred rejected in another. The plausible argument
that perhaps the figures represent a faithful picture of the actual
incidence among the populations drawn into these various induc-
tion stations is rejected by the writers of the report, who give
convincing reasons why this is most unlikely. We are left with the
unreliability of psychiatric assessment as the most probable cause.
Among rejects, there are equally large differences between
stations with respect to the different diagnoses. With an average of
39.9 per cent being diagnosed as psychoneurotic among all the
rejects, the percentage varies from 2.7 to 90.2 per cent for different
stations! Percentage of psychotic diagnoses varies from 26.8 to
.0 per cent. Percentage of psychopathic diagnoses varies from 70.9
to 2.0 per cent. Such differences again must be put down to
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unreliability of psychiatric classification, and while it is true that 
the psychiatrist was not given much time in these stations for 
arriving at a diagnosis, this handicap should not affect the relative 
incidence of different forms of disorder. 1 

It might be objected that in the study just quoted, psychiatrists 
were working under a disadvantage, due to being under heavy 
pressure and not having sufficient time to arrive at an accurate 
diagnosis, as well as to being forced to use diagnoses which might 
not be in line with their special training or with their theoretical 
outlook. We may therefore quote briefly a study of psychiatric 
reliability in which the dice were, as it were, loaded in favour of 
the psychiatrist (Air Ministry, 1947). A number of air crew who 
had been referred for examination by a psychiatrist, had in actual 
fact been seen by two psychiatrists at intervals of a few weeks; both 
psychiatrists made a report on the case which included a three- 
point rating on his neuroticism (‘predisposition to neurosis’). 
Unfortunately, the two opinions were not independent; the second 
psychiatrist had available to him some part of the opinion of the 
first psychiatrist to study the case, thus possibly increasing the 
probability of agreement. Also, the variance of the trait under 
consideration (neuroticism) is likely to have been much larger in 
this sample than in an unselected group—some 25 per cent were 
rated as having had severe predisposition, a rating very much less 
frequently encountered in unselected populations. This extension 
of range again would tend to increase the inter-observer reliability. 
Thirdly, judgment here is ex post facto; the psychiatrist is not asked 
to make a forecast as to what a person may do under unspecified 
conditions (as is the case in the American work quoted above). He 
is asked to rate the person in the light of his knowledge of that 
person’s behaviour in a specified situation or set of situations, 
manifestly a much easier task. On the other hand, presumably 
these 541 people were seen twice because they presented certain 
difficulties in diagnosis, a fact which would tend to lower the inter- 
observer reliability. On the whole, probably these various con- 
siderations almost balance out, making the final figure for the 
reliability of psychiatric ratings a slight overestimate. 

Yet the overall results, in spite of these relatively favourable 


1 It should not be supposed that medical diagnoses in fields other than the
psychiatric are invariably characterized by a high degree of reliability; the
papers by Franzen & Brimhall (1942a, 1942b) should be consulted in this
connection.
conditions, are rather disappointing. Total misclassification (dis-
agreements of one observer with the other) is 31 per cent. The
coefficient of contingency is .459; when this is corrected for the
small number of categories in order to make it comparable with a
product moment correlation, the value only rises to .534. This
value is far below anything that would be considered acceptable
in connection with a psychological test, and when in other parts
of this book correlations are run between psychological tests and
psychiatric criteria, it must be borne in mind that even under
favourable conditions this criterion has a reliability of only .534,
and that consequently even a perfect test of neuroticism could not
correlate higher than √.534 = .731 with this criterion.1
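
The ceiling just quoted follows from the familiar rule that a measure cannot correlate with a criterion more highly than the square root of that criterion’s reliability. A minimal sketch of the arithmetic is given below; the function names are illustrative only, and the second function simply inverts the relation, as is done for the figure of .71 mentioned in the footnote to this passage.

```python
import math


def max_validity(criterion_reliability: float) -> float:
    """Highest correlation a perfectly valid test could show with a
    criterion of the given reliability (square root of the reliability)."""
    if not 0.0 <= criterion_reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return math.sqrt(criterion_reliability)


def reliability_from_validity(validity: float) -> float:
    """Inverse relation: the reliability implied when an observed validity
    is treated as the ceiling, i.e. the square of that validity."""
    if not 0.0 <= validity <= 1.0:
        raise ValueError("validity must lie between 0 and 1")
    return validity ** 2


# Reliability of the psychiatric rating in the R.A.F. study: .534
print(round(max_validity(0.534), 3))
# Validity of the psychiatric rating in the earlier factorial study: .71
print(round(reliability_from_validity(0.71), 2))
```

Run as written, the two lines print 0.731 and 0.5, the values given in the text and in its footnote.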

While it might be objected that a diagnosis is not the same
thing as a rating, it is in fact difficult to draw a line between the
two; a rating of ‘emotional instability’ is often de facto equivalent to
a diagnosis of ‘neurotic’ in certain situations. On the whole, these
large-scale data bear out the widely-accepted and well-docu-
mented view that valid and reliable evidence of the personality of
the person rated is difficult to obtain by means of ratings, and
should never be accepted without empirical verification (Vernon,
1938). Ratings may give suggestive results, as for instance in the
important attempts of Cattell (1946) to isolate the fundamental
source traits in the ‘personality sphere’, but as Cattell himself
would be the first to agree, it is not until we go from the sphere of
ratings into the field of testing congruence of ratings with objective
measures that our findings are worthy of general acceptance.

Ratings, fundamentally, are the product of an interaction between
two personalities—rater and ratee—and as long as the former
cannot be regarded as forming a constant part of the total situa-
tion, so long will the final rating be partly an indication of the
rater’s personality as well as reflecting to an unknown degree the
personality of the subject who is being rated.

1 . . . a perfectly reliable and . . . judgments are perfectly . . . only approach this
con- . . . tests, and extracting . . . the limits of sampling errors, a valid and
reliable . . . neuroticism. It is interesting, in this connection, to note that in a previous study
along factorial lines the reliability of the psychiatrist’s judgment was computed,
using the inverse of the above equation for the maximum validity of a psychia-
trist’s rating, i.e. using the psychiatric rating’s known validity in terms of its
factor saturation to find the psychiatrist’s reliability (Eysenck, 1947). The
validity was .71, the reliability was .71² = .50, which agrees admirably with
the figure found empirically in the R.A.F. study under discussion.


(2) Pencil and Paper Tests 


(a) Questionnaires.—The evidence in this field shows quite clearly
that under suitable conditions questionnaire responses can be relied on 
to give excellent discrimination between normals and neurotics. 
One example has already been given on page 56, where it was 
shown that fifteen separate questionnaire-type scales gave positive 
correlations of varying size with the normal-neurotic criterion; it 
was also shown that the intercorrelations of these scales followed a 
pattern which could be shown by means of criterion analysis to 
support the general hypothesis of a general factor of neuroticism or 
stability advanced here. The actual discrimination obtained by the 
use of a questionnaire based on the first of the fifteen scales des- 
cribed is shown in Figure 11; it will be seen that when a cutting 
score of 9 is used, only 10.6 per cent of neurotics are misclassified
as normals, and 28.6 per cent of normals misclassified as neurotics.
The use of weighted scores gives a slight improvement, but the 
gain would not be sufficient to compensate for the complexity of 
the procedure. 

Similar results may be quoted from many different sources.
Typical of questionnaires used in this connection is the Maudsley
Medical Questionnaire, which is reproduced below (Table XVII).
In Table XVIII are given the scores (number of ‘Yes’ answers) of
1,000 normal and 1,000 neurotic (discharged) members of H.M.
Forces; these figures are shown in diagrammatical form in Figure
12. A more detailed comparison of answers to individual items is
given in Figure 13, which shows percentage endorsements of
normals and neurotics respectively for the sixteen most discrimin-
ating items of the scale.
TABLE XVII

THE MAUDSLEY MEDICAL QUESTIONNAIRE

NAME ..........          ARMY No. ..........

Each question is answered ‘Yes’ or ‘No’.

(1) Do you have dizzy turns?
(2) Do you get palpitations or thumping in your heart?
(3) Did you ever have a nervous breakdown?
(4) Have you ever been off work through sickness a good deal?
(5) Did you often use to get ‘stage fright’ in your life?
(6) Do you find it difficult to get into conversation with strangers?
(7) Have you ever been troubled by a stammer or stutter?
(8) Have you ever been made unconscious for two hours or more
by an accident or blow?
(9) Do you worry too long over humiliating experiences?
(10) Do you consider yourself rather a nervous person?
(11) Are your feelings easily hurt?
(12) Do you usually keep in the background on social occasions?
(13) Are you subject to attacks of shaking or trembling?
(14) Are you an irritable person?
(15) Do ideas run through your head so that you cannot sleep?
(16) Do you worry over possible misfortunes?
(17) Are you rather shy?
(18) Do you sometimes feel happy, sometimes depressed, without
any apparent reason?
(19) Do you daydream a lot?
(20) Do you seem to have less life about you than others?
(21) Do you sometimes get a pain over your heart?
(22) Do you have nightmares?
(23) [item not legible in the source]
(24) Have you ever walked in your sleep?
(25) Do you sweat a great deal without exercise?
(26) Do you find it difficult to make friends?
(27) Does your mind often wander badly, so that you lose track of
what you are doing?
(28) Are you touchy on various subjects?
(29) Do you often feel disgruntled?
(30) Do you often feel just miserable?
(31) Do you often feel self-conscious in the presence of superiors?
(32) Do you suffer from sleeplessness?
(33) Do you ever get short of breath without having done heavy
work?
(34) Do you ever suffer from severe headaches?
(35) Do you suffer from ‘nerves’?
(36) Are you troubled by aches and pains?
(37) Do you get nervous in places such as lifts, trains or tunnels?
(38) Do you suffer from attacks of diarrhoea?
(39) Do you lack self-confidence?
(40) Are you troubled with feelings of inferiority?
TABLE XVIII

Score        Normal    Neurotic
0-2            118        55
3-5            171        34
6-8            187        52
9-11           165        66
12-14          120        80
15-17          101        81
18-20           65        96
21-23           42       123
24-26           21       124
27-29           12       144
30+              1       145
Average:      9.98     20.01
Figure 12.—Distribution of Scores

Figure 13.—Percentage of endorsements of normal and neurotic subjects on sixteen most discriminating items of the Maudsley Medical Questionnaire

Comparable results to those quoted above are available from
the selection and screening work of the U.S. Navy, an example of
the results of which is given in Figure 14. The differentiation
between well-adjusted, doubtful, and discharged sailors is quite
clear and springs to the eye (Stuit, 1947). The War Shipping
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Administration has also published some of the results achieved 
through the use of neuroticism questionnaires; Figure 15 shows a 
typical picture of the differentiation achieved between normal and 
neurotic merchant seamen (Killinger, 1947). There is little advan- 
tage in multiplying instances; it is hardly possible to deny that 
unselected subjects make scores markedly different from those 
made by subjects diagnosed as ‘neurotic’ by psychiatrists, and 
separated from the service.
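
The degree of separation shown in Table XVIII can be expressed as misclassification rates at any chosen cutting score. The sketch below is an illustration only: the function name and the choice of cutting scores are assumptions, and a cutting score must fall on a band boundary because the scores are grouped. The normal column as printed sums to 1,003 rather than 1,000, so each rate is taken on the printed column total.

```python
# Grouped scores from Table XVIII: (lowest score in band, normals, neurotics).
BANDS = [
    (0, 118, 55), (3, 171, 34), (6, 187, 52), (9, 165, 66),
    (12, 120, 80), (15, 101, 81), (18, 65, 96), (21, 42, 123),
    (24, 21, 124), (27, 12, 144), (30, 1, 145),
]


def misclassification(cutoff: int) -> tuple[float, float]:
    """Return (proportion of neurotics scoring below the cutting score,
    proportion of normals scoring at or above it).

    The cutting score should be the lowest score of a band, since only
    grouped frequencies are available.
    """
    normal_total = sum(normals for _, normals, _ in BANDS)
    neurotic_total = sum(neurotics for _, _, neurotics in BANDS)
    missed = sum(neurotics for low, _, neurotics in BANDS if low < cutoff)
    false_alarms = sum(normals for low, normals, _ in BANDS if low >= cutoff)
    return missed / neurotic_total, false_alarms / normal_total


for cutoff in (12, 15, 18):
    missed, false_alarms = misclassification(cutoff)
    print(f"cutting score {cutoff}: "
          f"{missed:.1%} of neurotics below, "
          f"{false_alarms:.1%} of normals at or above")
```

Raising the cutting score reduces the proportion of normals classed as neurotic at the cost of missing more neurotics; which balance is preferable depends on the purpose of the screening.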
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It might be maintained that these results are artefacts due to 
the selection process, in the sense that it is only after their break- 
down and separation from the service that neurotics give answers 
differing from those of normals; if that were so it would not be 
possible to use questionnaires in attempts to predict breakdown, 
and in screening out those recruits most prone to neurotic dis- 
ability. Three studies tend to invalidate this common objection. 

In the first place, questionnaires tend to show high correlations 
with psychiatric opinion. Thus in one experiment in which over 
500 subjects were given the Maudsley Medical Questionnaire, and 
interviewed by a psychiatrist, a correlation of + .70 was obtained,
although the psychiatrist was of course quite ignorant of the 
questionnaire results (Eysenck, 1947b). But as we have shown in
the preceding section, the validity of psychiatric ratings for future 
breakdown has been established; it would seem to follow that 
methods correlating highly with psychiatric assessment would also 
show a certain degree of validity. 


Figure 15


In the second place, questionnaire results within an unselected
population can be shown to be related to independent variables
such as age and education in precisely the same way in which
these variables are related to the incidence of neurosis. ‘If we
examine the age-education distributions of the cross section of
white enlisted men . . . and the hospitalized psychoneurotics
from among their number . . . it is at once apparent that there is
an excess of psychoneurotics in exactly those classes in which the
Anxiety Symptoms Index (the questionnaire used in these studies)
was shown to select relatively higher proportions—among older
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men and among the less well educated men’ (Stouffer, 1949). This 
striking agreement argues strongly in favour of the general 
validity of the questionnaire results. 

In the third place, we have in addition to such indirect evidence
as has been presented above, direct evidence regarding the fore-
casting efficiency of questionnaire items, the men tested being
followed up over a period of six months. A sample of 73 men who had had a neurotic
breakdown subsequent to their test was compared with a sample of 
730 men who had had no such breakdown subsequent to their test, 
and who were matched with the ‘breakdown’ group for age, 
education, and marital status. The results of this experiment 
(Stouffer, 1949) show very clearly marked differences in question- 
naire responses between the experimental and the control group. 
This result is all the more interesting as typical neuroticism ques- 
tions were intermingled with questions dealing with various aspects 
of morale; the former (e.g. ‘In general, how would you say you 
feel most of the time, in good spirits or low spirits?’ or ‘What kind 
of physical condition are you in?’) were far superior in differentiat- 
ing the two groups than were the latter (e.g. ‘In general how well 
do you think the Army is run?’ or, ‘In general what sort of time do 
you have in the Army?’). It seems clear, then, both from our dis-
cussion of the indirect evidence and from the direct evidence just
quoted, that questionnaire responses do not only differentiate 
between acknowledged neurotics and normals, but also that such 
responses can be used to single out the prospective neurotic from 
a group of unselected recruits. 

This general conclusion is in agreement with Vernon’s (1 938) 
appraisal of questionnaire studies: ‘We are probably justified in 
concluding that they do measure psychologically significant vari- 
ables when the testees are adequately motivated to give candid 
responses.’ In a similar vein, Ellis (1946, 1948) concludes his review 
of some four hundred validation studies by saying: ‘Military 
applications of personality inventories have yielded enough favour 
able results to command attention. In contrast, personality inven- 
tories in civilian practice have generally proved disappointing. 
Regarding the use of questionnaires in civilian practice, he holds 
that ‘the older, more conventional, and more widely used forms 9 
these tests seem to be, for practical diagnostic purposes, hardly 
worth the paper on which they are printed.’ In support of this 
statement he presents a summary table showing that validation 
studies in non-military groups very frequently show ‘negative
validity’, which appears to be his somewhat surprising way of
saying that validity coefficients are below +.40. While his use of
descriptive epithets is unusual (a validity coefficient of +.69
would be called by him ‘questionable positive’), there is little doubt
that he has succeeded in showing that questionnaires in civilian 
work are less valid than in military research. 

This diversity of results obtained with questionnaires underlines 
the dependence of results on the motivation of the subjects, and the 
general conditions under which the testing is carried out. As long 
as neither motivation nor conditions can be adequately controlled, 
so long will it be impossible to regard questionnaires and inven- 
tories as anything more than supplementary evidence, to be 
regarded with suspicion unless shown to be valid in the particular 
set of circumstances which happens to be under investigation. 
Under favourable conditions, as the figures quoted above show, 
questionnaires can be of very great utility and scientific value; 
under less favourable conditions, validity coefficients may approach 
zero, or even be negative. As in the case of ratings, it is difficult to 
see how a scientific analysis of human personality can proceed on 
such an unreliable, shifting basis. 


[Figure 16.—Description of Sample. Chart of Word Connection List means and standard deviations by group; the plotted values are not legible. Groups named, with sample sizes where legible: Neurotics; Original standardization group (200); Middle-class sample, both sexes (200); Middle-class sample, both sexes (233); Industrial workers, male (143); Nurses, female (126); Members of Guards Regiment; Navy continuous service personnel; Total Sample (2,562).]

(b) Word Connection List.—This test, which is essentially an
‘alternative choice’ version of the word association test, is repro- 
duced in Table XIX. Of the two responses offered for each 
stimulus word, one is more frequently chosen by neurotics, the 
other by normals (Crown, 1948). Total score is the number of 
neurotic responses underlined. The validity and reliability of this
test have been established on large numbers of different populations,
and representative means and standard deviations are given in 
Figure 16 to illustrate typical results. These values, which are taken 
from Crown's (1951) paper summarizing work on this test, show a 
marked difference between the neurotic and the normal groups. 
Further results will be quoted later on to substantiate the claims 
made for this test. 
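
The scoring rule just described (total score is the number of neurotic responses underlined) can be sketched in a few lines of Python. The key below is invented purely for illustration: the text does not say which response of each pair is the one favoured by neurotics, so the entries of `ILLUSTRATIVE_KEY`, the function name, and the treatment of omitted items are all assumptions.

```python
# Hypothetical scoring key: stimulus word -> response assumed (for this sketch
# only) to be the one chosen more often by neurotics. The published key is not
# given in the text.
ILLUSTRATIVE_KEY = {
    "SLEEP": "nightmares",
    "FOOD": "poisoned",
    "RIVER": "danger",
}


def word_connection_score(underlined, key=ILLUSTRATIVE_KEY):
    """Count the stimulus words for which the keyed response was underlined.

    `underlined` maps each stimulus word to the response the subject chose.
    Items missing from `underlined` simply add nothing to the score.
    """
    return sum(1 for word, keyed in key.items() if underlined.get(word) == keyed)


answers = {"SLEEP": "bed", "FOOD": "poisoned", "RIVER": "danger"}
print(word_connection_score(answers))
```

A real key would cover every item of Table XIX; the three-item key here only shows the counting rule.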

(c) Annoyances, Interests, Worries, etc.—Many pencil and paper
tests make use of lists of items which might be considered annoying,
worrying, interesting, or in some other way related to the
emotional life of the subject; the number of items underlined or
crossed out is then taken as the score. This type of test, pioneered
by Pressey in his X-O test, is less direct than a questionnaire, but
more direct than the word connection list; consequently its useful- 
ness would appear to lie between these two other methods of 
eliciting information through pencil and paper responses. The 
experimental study to be described was carried out by Bennett 
(1945) and Slater (1945) on 80 normal and 80 neurotic subjects. 
The following tests were used: 

(1) A neurotic inventory divided into three sections dealing 
with questions related to the clinical syndromes of anxiety, hysteria, 
and depression. (2) An annoyances test, listing sixty possibly 
annoying stimuli or situations of four kinds, fifteen of each, the 
whole in random order. These four types of stimuli or situation 
are: (a) frustration of self-assertion, (b) personal inadequacy, (c)
dirt or untidiness, (d) noise. (3) Three sections of the Pressey X-O
test, modified from the original, and dealing with: (a) activities for 
which an individual should be blamed, (b) things about which he 
has ever felt worried, nervous or anxious, (c) items which he likes 
or in which he is interested. Scores on these ten sub-tests were 
obtained, and intercorrelations established for the two groups 
separately, as well as for the combined group of normals and
neurotics. Biserial correlations were calculated between the
normal-neurotic dichotomy and each of the ten tests; these are reproduced
under ‘D’ in Table XX. Also given are first-factor saturations for
the normal, the neurotic, and the combined groups. Statistical
tests showed the feasibility of combining the two groups, by
comparing the hypothetical matrix derived from the combined analysis
with the correlation tables for the separate groups (p = .10 in each
case).

TABLE XIX
WORD CONNECTION LIST

SURNAME.................. Christian Names.............

Below you will find a list of familiar words printed in capital letters, each
followed by two words in small letters. Look at the word SINK in the example
below. Now glance at the two other words. Does the word SINK make you
think more of ‘wash’ or of ‘drown’? Draw a line under whichever word is more
connected in your mind with SINK.

EXAMPLE: SINK wash drown

Some people connect ‘wash’ with SINK and so they underline ‘wash’ as
follows: SINK _wash_ drown. Other people connect ‘drown’ with SINK and
therefore they underline ‘drown’ as follows: SINK wash _drown_. There are
no right or wrong answers because one word connection is just as good as
another. Just look at the two words that follow the word in capital letters and
underline the one that you feel is more connected in your mind with that word.

WORK FAST. Don’t stop to think long about any one word. Be sure not to
leave any words out.

SCISSORS      nurse            cut
HANDS         feet             moist
LOUD          yell             soft
WOMAN         girl             trouble
LION          eat              tiger
LIGHT         dark             sentence
BLUE          sad              sky
STOMACH       food             ache
LEFT          home             right
THOUGHTS      ideas            strange
[illegible]   beware           fast
BOY           girl             mischief
SHORT         little           tall
CONTENTED     happy            discontented
FAIRY         shameful         wand
UNHAPPY       no               yes
THIRSTY       dry              drink
[illegible]   drinking         fasting
[illegible]   useless          good
[illegible]   weight           heart
[illegible]   lawyer           sorrow
[illegible]   ocean            hurt
FRIEND        double-crossed   close
[illegible]   hand             tingle
[illegible]   work             woman
WEIGHT        scale            losing
DIGNIFIED     snobbish         poised
TALKED        spoke            about
SLEEP         nightmares       bed
RIVER         lake             danger
BABY          foundling        little
LOSE          find             mind
SALMON        dislike          like
HUNGRY        thirsty          heart
MAN           hard             boy
MUTTON        eat              flesh
SWIFT         hurricane        slow
BRING         take             disaster
SWEET         affected         bitter
FOOD          stomach          poisoned
RAW           deal             meat
PARTY         crowd            myself
BITTER        medicine         sweet
GRAVE         serious          funeral
WOMAN         excitement       man
SOUR          lemon            stomach
CAN'T         concentrate      fly
PINT          quart            whiskey
NEEDLE        drug             sharp
WOUND         bandages         feelings

Remember: There are no right or wrong answers. Go as fast as you can.

TABLE XX

                        First Factor
                 Neurotics   Normals   Combined     D
Inventory:
  Anxiety           .77        [?]       .37       [?]
  Hysteria          .49        .55       .53       .43
  Depression        .78        [?]       .70       [?]
Annoyances:
  a                 .18        .03       .13       .25
  b                 .46        .50       .47       .49
  c                 .12        .21       .13       [?]
  d                 .63        .48       [?]       .72
Pressey X-O:
  a                 .06       −.02      −.08       .19
  b                 .43        .55       .49       .65
  c                 .29       −.29      −.18       .56


It will be seen that there is a high correlation between the 
criterion column (‘D’) and the first factor saturations of the groups, 
whether separate or combined (r = .64 for the combined values).
This fact may be taken as another indication of the correctness of 
the hypothesis of a general factor of ‘neuroticism’, as can also the 
fact that the two matrices derived from the normal and the neurotic 
subjects show patterns of intercorrelations substantially identical. 

Standard deviations of summed weighted standardized scores 
were estimated separately for neurotics and normals, giving values 
of 2.17 and 1.74, an insignificant difference statistically. As Slater
comments, ‘this is an interesting finding. If neurotic personality 
were a unitary characteristic persons selected for the possession of 
it would have more homogeneous scores of tests which measure it 
than unselected persons; instead they are found slightly, though 
not significantly, more heterogeneous.’ Here we have an argument 
independent of criterion analysis which essentially leads to the 
same conclusion of the quantitative nature of neuroticism, as
opposed to the qualitative concept. 

(d) Food Aversions.—The common psychiatric observation that 
neurotics tend to have special emotional attitudes to food has been 
quantified by Wallen (1945) in the form of a food aversion list. A 
list of twenty foods is read out to the subjects, and aversions, as
well as the reasons for dislikes, are recorded. Alternatively, printed
questionnaires may be used; there appears to be little to choose
between the two methods. Wallen has shown that neurotics report 
on the average a significantly larger number of food aversions than 
do normal recruits. 

Gough (1946) applied this list to 79 neurotic and 254 normal 
soldiers, and found mean number of aversions of 5.14 for neurotics
and 1.23 for normals, a very significant difference statistically.
Less than 5 per cent of the normals disliked more than three foods,
while about 70 per cent of the neurotics did so. Table XXI gives
the actual food names used, the percentages of normals and
neurotics disliking each, and the critical ratio (CR) of each difference. The most
interesting feature in this table from our point of view lies in the
fact that the dislikes of normals and neurotics are highly correlated
(rho = .82); in other words, foods disliked by neurotics are also
disliked by normals, but not to the same extent, and similarly,
foods liked by normals are also liked by neurotics, but not to the
same extent. Here, then, we have again support for the quantitative
view of the difference between normals and neurotics.
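
The critical ratio quoted for each food is a difference between two percentages divided by its standard error. The text does not say which standard error Gough used; the sketch below assumes the pooled-proportion form, and the function name and arguments are illustrative only. With the group sizes given above (79 neurotics, 254 normals) and the cheese entry of Table XXI (51.9 against 14.6 per cent), this form gives a value close to the tabulated 6.8.

```python
from math import sqrt


def critical_ratio(pct_a, n_a, pct_b, n_b):
    """Difference between two percentages divided by its standard error.

    Assumes the pooled-proportion standard error; this choice is an
    assumption of the sketch, not something stated in the text.
    """
    p_a, p_b = pct_a / 100.0, pct_b / 100.0
    pooled = (p_a * n_a + p_b * n_b) / (n_a + n_b)
    se = sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    return (p_a - p_b) / se


# Cheese entry of Table XXI: 51.9 per cent of 79 neurotics,
# 14.6 per cent of 254 normals.
print(round(critical_ratio(51.9, 79, 14.6, 254), 1))
```

Using separate (unpooled) variances for the two groups would give a somewhat smaller ratio for the same percentages, so the choice of standard error matters when comparing against published figures.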


TABLE XXI

PERCENTAGE OF AVERSIONS FOR NEUROTICS AND NORMALS
ON EACH OF TWENTY FOODS

Food                         Percentage of   Percentage of   Critical
                             Neurotics       Normals         Ratio
                             Disliking       Disliking

[?]                             29.1            [?]            4.5
Grapefruit juice                38.0            4.3            [?]
[?]                             34.2            5.5            [?]
Potato soup                     45.6            5.1            8.9
[?]                             31.6            8.7            [?]
[?]                              1.3            0              1.8
[?] chops                       20.3            2.4            5.6
[?]                              8.9            1.2            3.5
[?]                             16.4            1.2            5.5
Cottage cheese                  51.9           14.6            6.8
[?]                             36.7            [?]            [?]
[?]                             [?]             [?]            [?]
Cabbage [?] (broad beans)       [?]             [?]            [?]
[?]                             12.7            1.6            4.3
Mushrooms                       53.2           25.2            4.[?]
[?]                             20.6            9.8            3.8
Tomatoes                        [?]             [?]            [?]
Cantaloupe                      13.9            3.5            3.4
[?]                             13.9            [?]            6.0
Pears                           [?]             [?]            [?]





(3) Objective Behaviour Tests 


In a sense, the major part of this book deals with objective 
behaviour tests of one kind or another, and it may seem superfluous 
to add a special section dealing with them here. However, as this 
type of test is still relatively unknown to many psychiatrists and 
clinical psychologists, and neglected almost completely in the 
practical work of the clinic, the hospital, and the workshop, it 
seemed worth while to present briefly some of the main developments
of this type of test by psychologists like Luria and
Davis in the field of motor control, Hull in the field of suggestibility, 
Ryans and others in the field of persistence, and various members 
of the writer's department in several other fields. No attempt will 
be made to duplicate the summaries given in Dimensions of 
Personality of these various areas; only one major research will be
quoted in each case to establish landmarks, as it were, to guide 
later interpretations of factorial studies. 

(a) Suggestibility.—It has been shown elsewhere (Eysenck, 1947)
that the most convenient, most reliable, short test of this personality
trait, established in two factorial studies, was Hull's Body Sway
test. In this test the subject, standing with his eyes closed, his hands
hanging down by his side, and his feet together, is made to listen
to a gramophone record which repeats the words: ‘You are falling,
you are falling forward, you are falling forward all the time. You
are falling, you are falling, you are falling now. . . .’ His maximum
forward or backward sway in response to this suggestion during the
2½ minutes’ duration of the test is his score, and it has been shown
in several distinct researches that the amount of sway, and the
frequency of complete falls, increase with an increase in the
degree of neuroticism.¹ As an example we quote below results from
960 male and 390 female subjects; 60 subjects in each group were 
normal, the others were neurotics rated by psychiatrists on a six- 
point scale as to severity of disorder, I being the mildest, VI the 
most severe grade (Figure 17). The amount of body sway in inches 
is shown for each of these groups; the number of cases in each 
group is given in brackets underneath each column. It will be seen 
that there is a completely regular progression in amount of sway 
from the normal group through the intermediate neurotic groups 
to the most neurotic group, both for the men and for the women. 
Where a sway of 1 inch is about the normal average, the most
severe neurotic groups swayed 5.55 and 6.72 inches respectively
for the two sexes. This study is but one of several showing a strong
correlation between neuroticism and suggestibility. In none of the
studies, incidentally, were any marked differences noted between
hysterics and other types of neurotics.

¹ This test is illustrated in photographs 5, 6, 7, 8.


[Figure 17: bar charts of body sway in inches (vertical axis, 0 to 7) for males and females, by group: Normal, I, II, III, IV, V, VI. Number of cases, males: Normal 60, I 54, II 132, III 247, IV 244, V 154, VI 69; females: Normal 60, I 13, II 54, III 90, IV 100, V 54, VI 19. Of the plotted values only 5.50 and 5.55 (males) and 6.72 (females) are legible.]

Figure 17.—Average suggestibility of Normals and Neurotics, showing
increase in suggestibility correlated with increase in ‘Neuroticism’


(b) Manual Dexterity.—While tests of this type have usually been
regarded as tests of ability pure and simple, it has been shown that
very large and significant differences can be observed between
normals and neurotics with respect to various tests in this group,
such as the O'Connor Tweezers test (Eysenck, 1950), for instance,
or the U.S.E.S. tests (Lubin, 1951).¹ Correlations with the
neurotic-normal dichotomy are in the neighbourhood of +.60. Psychotics
also appear to be deficient in these tests, but their pattern of 
disability is very different from that of the neurotics, as will be 
shown later. 

(c) Level of Aspiration.—The general technique of level of aspiration
tests requires the repeated performance of a task (usually a
mechanical one) which can be carried out with varying degrees of
speed, accuracy, or goodness; it is essential that the variable chosen
should be expressible in numerical terms (number of seconds,
number of errors, etc.), and that the subject should not be able
from his performance to estimate with great precision his actual
score.² The subject is made acquainted with the task; he is then
asked to give an estimate of how well he expects to do on his next 
performance. (This estimate must be made in quantitative terms, 
of course.) This estimate is called his ‘aspiration score’. He then 
performs the task, and is asked to give an estimate, again in 
numerical terms, of how well he has done; this estimate is called 
his ‘judgment score’. He is then told what his actual score on the 
performance was (his ‘performance score’), and asked to estimate 
how well he will do next time. This procedure is repeated a number 
of times—usually from five to ten repetitions are required to get 
reliable data. Two main scores are derived from these data: 
(1) Mean goal discrepancy score. This is based on the difference 
between actual performance on trial X and aspiration for trial 
(X + 1); it is positive if the aspiration is higher than the preceding 
performance, and negative if it is lower. Discrepancy scores are 
averaged for all trials. (2) Mean judgment discrepancy score. This 
is based on the difference between actual performance score and 
judgment score on trial X; underestimation of past performance 
gives a negative score, overestimation a positive one. 
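
The two scores can be worked through in a short sketch. It assumes that higher numbers mean better performance (as with hits on a counter); for a time or error score the signs would have to be reversed. The function name and the three-list layout are illustrative choices, and the first aspiration, given before any performance, does not enter the goal discrepancy because it has no preceding trial.

```python
from statistics import mean


def discrepancy_scores(aspirations, performances, judgments):
    """Return (mean goal discrepancy, mean judgment discrepancy).

    aspirations[i]  : estimate given before trial i
    performances[i] : actual score on trial i
    judgments[i]    : subject's estimate of trial i, given before being told
    Higher numbers are assumed to mean better performance.
    """
    if not (len(aspirations) == len(performances) == len(judgments)):
        raise ValueError("one aspiration, performance and judgment per trial")
    if len(performances) < 2:
        raise ValueError("at least two trials are needed")

    # Goal discrepancy: aspiration for trial X + 1 minus performance on trial X.
    goal = [aspirations[i + 1] - performances[i]
            for i in range(len(performances) - 1)]
    # Judgment discrepancy: judgment minus performance on the same trial,
    # so underestimation comes out negative.
    judgment = [j - p for j, p in zip(judgments, performances)]
    return mean(goal), mean(judgment)


# Invented figures for three trials, for illustration only.
print(discrepancy_scores([20, 24, 25], [18, 21, 22], [16, 20, 21]))
```

In this invented record the subject aims above each preceding performance (positive goal discrepancy) and rates each completed trial below its true score (negative judgment discrepancy).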

The relation of these scores to neurosis appears to be curvilinear
rather than linear. We find that when the neurotic sample
is broken up into the introverted (dysthymic) and extraverted
(hysteric) groups, these groups show large differences. Introverts
have high levels of aspiration (high positive goal discrepancy scores)
and severely underestimate their past performance (high negative
judgment discrepancy scores). Extraverts have low levels of
aspiration (low positive or even negative goal discrepancy scores)
and show no tendency to underestimate their past performance;
indeed, they may overestimate it (low negative or even positive
judgment scores). Normals in either case are intermediate between
these two extremes, having medium high positive goal discrepancy
scores and medium high negative judgment discrepancy scores
(Eysenck, 1947; Himmelweit, 1946; Miller, 1951). Figure 18 gives
a schematic drawing of these relations. It follows from what has
been said that differences between a normal group and a neurotic
group would depend on the proportion of hysterics and dysthymics
in the neurotic group: unless this is controlled it is difficult to make
any prediction.

[Figure 18: schematic drawing for hysterics, normals, and dysthymics; legible panel label: Judgment Discrepancy Scores.]

¹ These and several other tests used at various times are illustrated in
photographs 19, 20, 23, 25, 26.

² The most frequently used task in our work has been the Triple Tester,
illustrated in photograph no. 10. The subject is required to manipulate the
handwheel so as to move the stylus which plays on the drum; the connection
between wheel and stylus is through an integrating disc. Object is to touch with
the stylus as many holes in the ivorine cover of the drum as possible; hits are
scored automatically on an electric counter (Eysenck, 1947). This instrument
was developed by the Cambridge Psychological Department for quite different
purposes, but proved to be particularly useful for work on level of aspiration.

Not only do normals and neurotics differ with respect to actual 
scores; they also differ with respect to the pattern of scoring. When 
goal and judgment discrepancies are correlated, these correlations 
tend to be positive for normal and negative for neurotic groups. 
As an example, we may take the following correlations, based on 
male and female groups respectively: +.59 and +.39 for normals,
and −.68 and −.39 for neurotics. These differences are presumably
due to the peculiar curvilinear relation between neuroticism
and the discrepancy scores. 

(d) Motor Response Disorganization.—The view that under
emotional stress there is a disorganization of motor processes, and 
that this disorganization will be more noticeable in neurotics than 
in normals, is not a novel one, and may be traced in its experi- 
mental development to the early work of Nunberg at the time of 
the First World War, who combined the Word Association test 
with measurement of hand movements of an involuntary kind 
(1918). 

These early and technically somewhat unsatisfactory studies 
were taken up by Luria (1932), in a series of researches too well 
known to be detailed here. Although others, such as Olson and 
Jones (1931), Crosland (1931), Huston et al. (1934), Barnacle et al. 
(1935), Houtchens (1935), Runkel (1936), Burtt (1936), Krause 
(1937), Gardner (1937), Shuey (1937), Speer (1937), Sharp
(1938), Reymert (1938), Morgan and Ojeman (1942), Albino
(1948), and particularly Clarke (1950), have shown the essential
correctness of the general thesis through use of the Luria technique,
the example given here is taken from the methodologically somewhat
different work of Davis (1948).

This writer carried out a series of investigations into the causes 
of pilot error. These experiments were carried out in a simulated 
cockpit, which is similar to the Link trainer, except that it does
not itself move; instruments on the panel respond realistically to
movements of the controls. The effects of control movements were
less complex than those of an aircraft, but the test was apparently
accepted by the subjects as an exercise in instrument flying. This
exercise consisted of a series of four manœuvres, together occupying
ten minutes, repeated between intervals of straight and level
flying. Only the periods of manœuvre were scored; instructions
and recording were so arranged that a perfect performance was
recorded as a straight line, any deviation indicating an error.
Other detailed records of performance which were necessary were 
also made. 

Two types of error were observed; errors of overaction and errors 
of inertia. Subjects exhibiting overactivity ‘obtained large scores on
control movements. Their errors in instrument reading were small
and of short duration. . . . Responses to instrument deviations
were excessive, the extent and gradient of the movements being
greatly increased and over-correcting frequent, with the result that
secondary responses were required. Numerous restless movements
were observed. . . . Subjects felt excited and under strain, tense
and irritable and sometimes frankly anxious. They felt that correction
was urgent and made it impatiently. . . . Although . . . subjects
were dissatisfied with their performance, they were not discouraged,
but keen to improve and continued to try to do the test
well. . . . Several pilots reported preoccupation by the test for
some time after it was finished. Some returned later in the day and
asked for a further opportunity of doing the test.’

Subjects exhibiting the inertia reaction made ‘errors which
were large and of long duration, whereas activity, represented by
scores of control movements, was relatively little. . . . The individual
responses were less hurried and less disturbed by restless
movements than were those of the pilots showing the overactivity
reaction, but they were often more extensive than at the beginning
of the test, due probably to the larger size of the instrument deviations
and the tendency to make responses proportionate to the size
of the deviations. . . . Subjects reported that their interest had
flagged and that their concentration had failed. A feeling of strain
had now given way to one of mild boredom, tedium or tiredness.
. . . In contrast to the restless striving of the overactive class, the
pilots in the inert class gave the impression that they had lowered
their standards of performance to a level well within their [illegible].
. . . Subjects’ judgment of the degree of accuracy to which they had
attained was usually faulty, and they were unaware of the degree
to which they had failed to correct deviations of the instruments.
. . . There was . . . an emotional indifference as far as the test
itself was concerned.’

These results are closely in accord with Nunberg's observation
in his original work, which led him to posit the existence of two
types of individual, ‘the one showing inhibition under affect (the
inertia reaction), the other exhibiting increased excitability (the
overaction response)’. They also link up remarkably well with
experimental work on the extravert-introvert dimension reported
in Dimensions of Personality (Eysenck, 1947) showing that introverts
(like the overactive response type) tend to show anxiety, to be
irritable, to have a high level of aspiration, and to be dissatisfied
with their performance. Extraverts (like the inertive response type)
tend to show emotional indifference, to have low levels of
aspiration, adjusted to actual performance, and to be satisfied
with their performance, however poor.

These experiments thus give rise to two main hypotheses which
admit of proof or disproof: (1) Pilots who are neurotic should show
a larger proportion of abnormal (overactive or inert) reactions
than pilots who are not diagnosed as neurotic, and (2) pilots who
are diagnosed as suffering from dysthymic disorders (anxiety state)
should show the introverted, overactive behaviour pattern, while
pilots suffering from hysteric disorders should show the extraverted,
inert behaviour pattern. Both these hypotheses were tested
by Davis on 355 normal and 39 neurotic pilots. Taking the comparison
between normal and neurotic pilots first, it will be seen
from Table XXII that both predictions are verified: 75 per cent of

TABLE XXII
TEST RESULTS OF NORMAL AND NEUROTIC PILOTS

                                      Reactions
                          Normal       Overactive      Inert        Total
                          N     %      N     %         N     %      N

Normal Pilots            268   [?]    [?]   [?]       [?]   [?]     355
Neurotic Pilots           13   33      11   28         15   38       39
  Acute Anxiety State      6   43       7   50          1    7       14
  Hysteria                 1   12.5     1   12.5        6   75        8
  Other                  [?]   [?]    [?]   [?]       [?]   [?]      17


[Photographs: 10. Cambridge Triple Tester, used for Level of Aspiration experiments. 11. Psycho-galvanic reflex apparatus, with electrodes. 12. Tracing board.]

[...] significance, that the acutely anxious (introvert) type of patient
tends to make more extensive responses than did the normal 
subject, while the hysteric (extravert) patient tends to make less 
extensive ones. This apparatus is illustrated in photographs 13 and 
14. The task consists of aligning the central pointer with the line at 
the right or at the left, according to the brightness of two lights 
flashed on at both sides. Movement of the pointer is mediated 
through an integrating disc, and is produced by turning the hand- 
wheel. Provision is made for automatic recording of movements, 
time sequences, etc. (Cf. Davis (1948).) 

Our hypothesis of a normal-neurotic continuum would also
require that within the normal group there should appear a correlation
between neurotic predisposition and percentage of abnormal
responses. The percentage of abnormal responses made by the 355
normal pilots is given in Table XXIII after they had been divided
into three groups according to psychiatrically assessed degree of
neurotic predisposition; it will be seen that there is an increase in
abnormal responses from 16 per cent in the group showing least
predisposition, through 27 per cent to 46 per cent in the group
showing the greatest amount of predisposition. Thus this third
hypothesis is also confirmed by Davis's results at a reasonable level
of statistical significance. We may take it as established, therefore,
that under emotional stress there is a disorganization of motor
processes, a disorganization which is closely correlated with
neuroticism or lack of emotional stability.
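
The percentages quoted are the share of overactive plus inert reactions within each predisposition group. As a check on the arithmetic, here is a minimal sketch; the function name is illustrative, and reading the three legible figures in the ‘moderate’ row of Table XXIII (25, 14, 7) as normal, overactive and inert counts is an assumption. Under that reading the row reproduces the 46 per cent given in the text.

```python
def abnormal_percentage(normal, overactive, inert):
    """Percentage of pilots whose reaction was overactive or inert."""
    total = normal + overactive + inert
    if total == 0:
        raise ValueError("group is empty")
    return 100.0 * (overactive + inert) / total


# Counts read from the 'moderate predisposition' row of Table XXIII,
# taking its three columns to be normal, overactive and inert reactions.
print(round(abnormal_percentage(25, 14, 7)))
```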


TABLE XXIII
ASSOCIATION OF NEUROTICISM AND TEST SCORE

Predisposition       Normal     Overactive     Inert

Nil                   130         [?]           12
Slight                [?]         [?]            9
Moderate               25          14            7



(e) Body Control.—Several experiments on static ataxia, the
Heath rail walking test,¹ and other tests of body control have
shown that neurotics are notably inferior in this respect. Results
shown below in Figure 19 give data on a simple static ataxia test
on 120 normal and 1,230 neurotics, in which instructions were
simply to stand still and relaxed, with the eyes closed and the
hands hanging down the side, feet together. The test lasted
30 seconds only, and maximum sway either forward or backward
was the score of the subject who was being tested.

[Figure 19.—Static Ataxia Test of Body Control: Normal Group (N = 120) and Neurotic Group (N = 1,230); the plotted values are not legible.]

¹ Illustrated in photograph 18.


(f) Rorschach Test.—It i[...]
exactly the same manner a[...]
example chosen is a factorial [...]
A. Sen (1949). The importance of this stud[...]
points. In the first place, the subjects were Indi[...]
[...]udy enables us to form an opinion on
[...] raised by Burt (1915). In his analysis
[...] the British Association. ‘The analyses,
[...] 172 normal children and 157 normal adults,
led to the hypothesis of a common factor underlying all the primary
emotions, which was termed “general emotionality” (by analogy
with “general intelligence”).’ In a recent paper Burt (1950) has
taken this notion up again, and has contrasted the factor of
‘emotionality’ with the factor of ‘neuroticism’. ‘A number of
investigators who have carried out factorial studies on neurotic
cases have found a similar general factor. Many have inferred that,
since this factor is common to all the traits assessed for such groups,
it can therefore be designated a “general factor of neuroticism”.
. . . But a general factor for neuroticism would be a factor that is
found among neurotic cases only or among traits peculiar to
neurotics; and to prove such a conclusion it would be essential to
examine a control group of normal persons, and demonstrate that
the factor in question is not found among the normal.’

This argument is rather curious, as the general factor of 
neuroticism has always been presented as constituting a continuum 
ranging from the most extreme instability to the most marked 
stability on the other side; to imagine that it should appear only
among the neurotic would be equivalent to saying that a general
factor of intelligence should appear only among the intelligent, and
that to prove such a conclusion it would be essential to examine
a control group of dull persons, and demonstrate that the factor
in question is not found among the dull! However, arguments
in science are not usually as convincing as experimental demonstrations,
and Sen’s experiment does enable us to reach a conclusion,
if only a provisional one, regarding the nature of this factor.

Her subjects were given an individual Rorschach test, a verbal 
and a non-verbal group test of intelligence, and Cattell’s test of 
fluency. In addition, each subject was rated independently by two
judges for the following traits: intelligence, verbal ability,
imagination, perception of relations, general emotionality,
extraversion-introversion, assertion-submission, cheerfulness-depression,
sociability, anxiety, neurotic tendencies, and extent of vocational and
cultural interests. Ratings were on a fifteen-point scale, inter-rater
reliabilities ranging from .51 to .79, with an average of .71. Scoring
of the Rorschach was carried out, using Beck’s procedure with two 
minor changes. A slight modification of his ‘Z-score’ was intro- 
duced, and Klopfer’s ‘form level’ score was used in addition. 

Correlations between scoring categories and the mean intelligence
test score are relatively slight, only four being above the
5 per cent level of significance for both men and women. These
four categories are: good form (.46 and .44 respectively), movement
response (.41 and .42), modified Beck ‘Z-score’ (.41 and .42),
and ‘form level’ score (.50 and .59). To obtain evidence on correlations
with non-cognitive ratings, a factor analysis was performed
on tetrachoric intercorrelations between 36 Rorschach scoring
categories. Three main factors were extracted, contributing 29, 19,
and 7 per cent to the total variance. Approximate factor measurements
were calculated for these three factors by reducing the
original scores to standard measure, and then weighting them with 
the saturations for the three factors. As a last step, in order to 
interpret the factors, these factor measurements were correlated 
with the tests and ratings, which thus form an outside criterion. 
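
The procedure just described (reduce scores to standard measure, weight
by the saturations, correlate with an outside criterion) can be sketched
roughly as follows. This is only an illustration under stated
assumptions: the function names, the three scoring categories, the
saturations, and the criterion scores are all hypothetical and are not
taken from Sen's data.

```python
from statistics import mean, pstdev


def standardize(scores):
    """Reduce raw scores to standard measure (mean 0, S.D. 1)."""
    m, s = mean(scores), pstdev(scores)
    return [(x - m) / s for x in scores]


def factor_measurements(score_columns, saturations):
    """Weight each standardized category by its saturation and sum per subject.

    score_columns: one list of raw scores per scoring category, all in the
                   same subject order.
    saturations:   one saturation per scoring category, for a single factor.
    """
    z_columns = [standardize(col) for col in score_columns]
    n_subjects = len(score_columns[0])
    return [
        sum(w * z[i] for w, z in zip(saturations, z_columns))
        for i in range(n_subjects)
    ]


def pearson(x, y):
    """Product-moment correlation between two equally long lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical data: three scoring categories for five subjects.
categories = [[12, 30, 22, 18, 25], [3, 9, 6, 4, 8], [1, 0, 2, 1, 3]]
saturations = [0.96, 0.70, 0.10]   # illustrative values only
criterion = [20, 41, 30, 24, 37]   # e.g. scores on an outside test

measurements = factor_measurements(categories, saturations)
print(round(pearson(measurements, criterion), 2))
```

A category with no variation among the subjects would have a zero S.D.
and would have to be dropped before standardizing.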

The first factor is one of associative fluency. This is shown by
the fact that its highest saturation is for total number of responses
(.96), and by the fact that its only significant correlations are with
the fluency test (.48) and with the rating for imagination (.53) and,
very much lower, with the rating for general emotionality. The
second factor is clearly one of intelligence, having correlations of
.51 and .46 with the two intelligence tests, and a correlation of .50
with the rating for intelligence. There is also a very small negative
correlation of -.21 with the rating for neurotic tendencies, which
is well in line with the frequently observed tendency of neuroticism
to show a slight negative correlation with intelligence.

The third factor is identified more clearly even than the other
two. Its only significant correlation is with the rating for neurotic
tendencies (.68), but this is the highest of all the correlations
reported between factors and external criteria. The correlation of
this factor with general emotionality is only .11, which is quite
insignificant statistically. When it is borne in mind that the
reliability of the ratings was itself below .8, it will be seen that the
third-factor scores provide a remarkably good measure of
neuroticism, a conclusion well in line with Cox’s (1951) study of normal
and neurotic children, and one which supports the general view
taken throughout this chapter. There is no support at all in favour
of the view that general emotionality has any claim to be regarded
as a strongly-marked and important trait of personality. If relevant
at all, it would appear to be slightly related to the fluency factor,
and through it with extraversion-introversion, as suggested
tentatively in Dimensions of Personality. But the correlation between
fluency and emotionality (.25) is too low to be of any systematic
importance, and no other significant correlations are reported for
general emotionality. Consequently we may conclude that this
experiment has failed to bring forward any evidence in favour of
Burt's claims, and that instead the results fit in extremely well with
the general scheme of personality description developed here.
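
The point about the limited reliability of the ratings can be made
concrete with Spearman's correction for attenuation. The sketch below
is an added illustration, not a computation reported in the study; the
function name `disattenuate` is hypothetical, and the factor scores are
assumed, purely for simplicity, to be perfectly reliable, so that only
the unreliability of the ratings is allowed for.

```python
from math import sqrt


def disattenuate(r_observed, rel_x=1.0, rel_y=1.0):
    """Spearman's correction: observed r divided by the geometric mean
    of the two reliabilities."""
    return r_observed / sqrt(rel_x * rel_y)


# Observed correlation of the third factor with rated neurotic tendencies
# is .68; the rating reliability is stated to be below .8.
at_upper_limit = disattenuate(0.68, rel_y=0.80)   # reliability taken as .8
at_average = disattenuate(0.68, rel_y=0.71)       # average inter-rater value

print(round(at_upper_limit, 2), round(at_average, 2))
```

Because a lower reliability gives a larger corrected value, the figure
obtained with a reliability of .8 is, under these assumptions, the
smallest corrected correlation compatible with the data.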


(4) Constitutional Differences 


The type of study reviewed here deals with tests and measures
which are related not so much to behaviour, but rather to qualities
of the organism which underlie such behaviour. In the literature,
this type of work is frequently listed under the general heading
of ‘physiological psychology’, because constitutional factors are
obviously closely related to the physiological make-up of the
person. It is primarily on the European continent that
constitutional research has flourished, and it may be worth while to
emphasize the main difference between the physiological and the
constitutional approaches. In the writer’s view, this difference is
one of emphasis. To the physiological psychologist, reactions are
segmental, and to be studied as far as possible by excluding any
facets of the personality not immediately relevant to the ostensible
problem. To the constitutional psychologist, reactions are not
segmental, but always integrated with the whole personality, and
therefore useful in throwing some light on the person who is
reacting, as well as on the particular neural or muscular mechanism
which is being studied. To the physiological psychologist, dark
vision is a segmental phenomenon, to be studied in terms of
physiological variables, vitamin A deficiency, and the like; to the
constitutional psychologist, its interest lies primarily in its high
correlation with emotional instability and behaviour disorders, a
correlation to be discussed below. These two approaches are not
antithetical, but complementary; the phenomena studied have
more than one aspect and need to be studied from many different
points of view.

(a) Body Build (physique).—In Dimensions of Personality it was
shown that a correlation existed between body build and
neuroticism. Factor analysis was used in establishing two main factors in
body build: (1) a general factor of body growth or body size, and
(2) a type factor distinguishing between growth in length and
growth in breadth. An index of body build was based on this
second factor,

\[ \text{I.B.} = \frac{\text{Stature} \times 100}{\text{Transverse Chest Diameter} \times 6}, \]

which discriminated very significantly between normal and neurotic
soldiers, the neurotics having higher index values, and therefore
being more prone to the lean type of body build (leptomorph)
than the normals, who showed a larger number of mesomorphs
(index of body build within one S.D. of 100) and eurymorphs
(thick-set, broad body build, more than one S.D. below 100). It was
also shown that hysterics tended to be significantly more eurymorph
than dysthymics, who were markedly leptomorph in body build.

[Figure 20.—Saturation of Anthropometric Measurements with General
and Type Factors—Women]
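
As a rough illustration of how the index of body build and the
three-fold classification described above might be computed, consider
the following sketch. The function names are hypothetical, both
measurements are assumed to be in the same unit, and the S.D. of the
index is not given in the text, so the value used in the example is a
placeholder only.

```python
def body_build_index(stature, transverse_chest_diameter):
    """Index of body build: stature x 100 / (transverse chest diameter x 6).

    Both measurements must be expressed in the same unit.
    """
    return stature * 100 / (transverse_chest_diameter * 6)


def classify_body_build(index, sd, centre=100.0):
    """Classify an index value relative to 100, using the S.D. of the index.

    More than one S.D. above 100 -> leptomorph (lean build)
    within one S.D. of 100       -> mesomorph
    more than one S.D. below 100 -> eurymorph (thick-set, broad build)
    """
    if index > centre + sd:
        return "leptomorph"
    if index < centre - sd:
        return "eurymorph"
    return "mesomorph"


# Hypothetical subject: stature 176 cm, transverse chest diameter 26 cm.
ASSUMED_SD = 7.0  # placeholder; the actual S.D. is not stated in the text
index = body_build_index(176, 26)
print(round(index, 1), classify_body_build(index, ASSUMED_SD))
```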

These results were taken from work on men only. Since then, 
further work has been carried out on women by Rees (1950), 
showing that intercorrelations of body measures among them also 
give rise to two factors; diagrammatic representation of these two 
factors, body size and body type, is given in Figure 20. (An
index of body type for women was derived from the saturations of
the various measures for the type factor, giving I.B. = .59 stature
+ .47 symphysis height - .31 chest circumference - .64 hip
circumference.) Again, hysterics were found to be more frequently
characterized by eurymorph body build, dysthymics by lepto- 
morph body build. And again, the more leptomorph subjects 
tended to be more severely neurotic. It may therefore be regarded 
as established that body build is related to the general neuroticism 
factor, both in men and in women. 

(b) Dark Vision.—Neurotics have poorer dark vision than
normals, as has been demonstrated in several independent studies.
Figure 21 shows the result on the Livingstone Hexagon test of the
scores of some 6,000 R.A.F. personnel and 96 neurotics; a score of
[illegible] denotes complete failure in the test situation, while a
score of [illegible] denotes maximum success. The marked difference
between normals and the neurotics springs to the eye (Eysenck, 1947).

[Figure 21.—Dark Vision Test Scores (percentage scale; neurotic group
and normal group)]

(c) Autonomic Imbalance.—The hypothesis that neurotics are
characterized by imbalance of the autonomic nervous system finds
its strongest support in a monograph by Wenger (1948), following
up his earlier work with children. Abstracting only a few of his
many important findings, we reproduce in Table XXIV C.R.s for
autonomic tests used by him on 488 normal and 225 subjects

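TABLE XXIV
Column I: C.R., Normal Group v. Operational Fatigue Group
Column II: Factor saturation, Normal Group
Column III: Factor saturation, Operational Fatigue Group
Column IV: C.R., Normal Group v. Neurotic Group

Item | I | II | III | IV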

10. Salivary Output 2-86 | "25 — 8:13 
12. Salivary pH 2:50 — 48 423 
13. Dermographic Latency 1:00 "15 — KI 
14. Dermographic Persistence "93 чї —о: 61 
15. Palmar Conductance 4°84 19 45 “47 
17. Log Conductance Change 1-06 SU 23 3:36 
19. Volar Conductance 1:43 "19 4 "00 
21. Systolic Blood Pressure 6:52 — 37 446 
23. Diastolic Blood Pressure 8-60 30 — 460 
28. Heart Period 7:82 “60 +36 6-65 
30. Sublingual Temperature 2:50 “47 "45 2:00 
32. Finger Temperature 3:08 03, —+02 3:81 

44. Tidal Air Mean 5:92 = +00 247 ^ 
46. Tidal Air Sigma 42 15 — 1:58 
48. Oxygen Consumption -80 13 35 3°43 
52. Pupillary Diameter оо — оо "00 



suffering from operational fatigue. Also given in this table are
factor saturations obtained by him from intercorrelations of the
normal group only. Only his factor two is considered here, which
in his view ‘may be defined as representing the autonomic nervous
system’. A third column of the table is made up from the factor
saturations obtained by him from intercorrelations of the
operational fatigue group only (factor 4, page 85). These factors were
obtained by Wenger following Thurstone’s method of rotation; it
will be seen that the factor loadings of the tests are roughly
proportional to the discriminatory ability of the tests as shown by their
C.R.s. The correlations between columns 1 and 2 in the table, and
between columns 1 and 3, are .55 and .20 respectively. Both values
rise somewhat when subjected to criterion analysis, but even
without this refinement it is clear that the factors existing within each
group are essentially similar to, and identical with, the autonomic
imbalance factor which causes normals to differ from operational
fatigue cases, who from our point of view may be regarded as
falling towards the ‘unstable’ end of the neuroticism factor.

The fourth column of Table XXIV gives C.R.s for a comparison
on the same set of tests between a normal group of 488
subjects, and a group of 98 neurotics. This set of C.R.s is
proportional to the one given in column 1, and also to the factor saturations
in columns 2 and 3; the correlations are respectively .51, .42, and
.38. These values are sufficiently high to support the general
hypothesis that autonomic imbalance, operationally defined
through this set of tests, acts in a unitary manner and distinguishes
between normals and neurotics.
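
The comparisons between columns quoted above are product-moment
correlations taken across tests, each test contributing one pair of
values (for instance its C.R. and its factor saturation). A minimal
sketch of that calculation is given below; the five pairs of values are
invented for illustration and are not the entries of Table XXIV, and
the function name is hypothetical.

```python
from math import sqrt


def column_correlation(col_a, col_b):
    """Product-moment correlation between two table columns, test by test."""
    n = len(col_a)
    mean_a, mean_b = sum(col_a) / n, sum(col_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(col_a, col_b))
    var_a = sum((a - mean_a) ** 2 for a in col_a)
    var_b = sum((b - mean_b) ** 2 for b in col_b)
    return cov / sqrt(var_a * var_b)


# Invented values for five tests (NOT the entries of Table XXIV).
critical_ratios = [2.9, 1.0, 4.8, 6.5, 7.8]
factor_saturations = [0.25, 0.15, 0.19, 0.37, 0.60]

print(round(column_correlation(critical_ratios, factor_saturations), 2))
```

A high value indicates that the tests which best separate the two
groups are also those most heavily saturated with the factor.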

We have now shown that when we take large groups of persons
diagnosed as ‘neurotic’ by competent psychiatrists, and compare
them with large unselected groups of persons who have never come
under psychiatric supervision, and whom we may for convenience
label ‘normal’, large differences appear along each of the four
main types of approach we have outlined. In ratings based on
interviews, large differences are observed with respect to heredity,
neurotic traits in childhood, shyness and difficulty in making social
contacts, emotional instability, apprehensiveness, dependence,
marriage and other sexual difficulties, physical ill health, unstable
work record, and former psychiatric illnesses. In self-ratings
obtained from questionnaires, large differences are apparent with
respect to items referring to dizzy turns, palpitations, worrying,
nervousness, easily hurt feelings, shaking and trembling, irritability,
nightmares, sleeplessness, shortness of breath, lack of self-confidence
and feelings of inferiority. On objective behaviour tests, differences
are marked on suggestibility, manual dexterity, level of
aspiration, motor response disorganization, and body control. In the
field of constitutional differences, body build (physique),
autonomic imbalance, and defective dark vision were noted as effective
differentiating tests. This survey of the literature is far from
complete, but it does depict a mutually consistent pattern of personality
traits which may be regarded as an approach to the operational
definition of the personality of the neurotic. It also disproves, at a
very high level of confidence, the null hypothesis; there is very
little doubt that the group of ‘neurotics’ is differentiated from the
group of ‘normals’ on a basis distinctly different from chance.
We have also been forced to postulate that the ‘neurotic’ group
shows a lack of homogeneity which cannot be attributed to the
general factor of neuroticism, and our hypothesis has been that we
are dealing with a second factor, orthogonal to the neuroticism
one, which resembles in many important aspects Jung’s
extravert-introvert dichotomy. The introverted neurotic shows symptoms
of anxiety, depression, and irritability; he has overly high levels
of aspiration, overly high judgment discrepancies, is subject to
response disorganization of the ‘overactive’ type, and tends to be
of the leptomorph body build. This general syndrome we have


ulty associations which the older 


factors (neuroticism and extra 
ally defined by our tests and 
In this volume we are co 
only; extraversion-introversi 
Dimensions of Personality, 
nection with it will be ге 
tioned here for two reason 
in connection with level o 
data) enforce at least a b 
the results become almos 


particularly 
f aspiration and response disorganization 


as otherwise 
the general view 


therefore discussed here. 


We may conclude this brief survey by noting that all the
evidence quoted is strongly in support of our heuristic hypothesis
regarding the existence of a dimension of ‘neuroticism’
in personality research. We must now turn to a consideration of the
possibility of the exact measurement of a person's position on the
continuum posited.


Chapter Four 


THE NEUROTIC DIMENSION: OBJECTIVE 
MEASUREMENT 


It is a well-known principle in scientific procedure that the
qualitative and quantitative aspects of a problem tend to
interact and that advances in the qualitative analysis of a complex
field tend to be followed by improvements in quantitative
measurement, which in turn lead to advances in qualitative analysis.
Having in the previous chapter established the existence of a
dimension of neuroticism and defined it operationally, we must
now go on to discuss the possibilities of an exact quantitative
measurement, and the reliability and validity of such
measurement.


external by maximizing the
former; the external criterion, in turn, i [...] the experimental
population in this research. The ‘normal’ group consisted of men who had
had not less than six months’ service, who were not re-enlistees (as
far as possible), who had clean disciplinary records, and who, on
the Matrices test of intelligence, had service grades other than 4
or 5. (This eliminates all those with I.Q.s of 90 or below.)

The ‘neurotic’ group consisted of soldiers who were being dis- 
charged from the Army on neuropsychiatric grounds. Each of 
these had been seen by one of four psychiatrists, who diagnosed 
them under the three headings of hysteria, anxiety state, or psycho- 
pathy. No psychotics were tested and the psychiatrists agreed not 
to send patients with service grades of 4 or 5 unless they felt
on clinical grounds that the patients were more intelligent than
indicated by their test records.²

Certain deviations from the original plan, as outlined above, 
took place which must be noted. Regarding the controls, it was 
found impossible to obtain a large enough sample not contain- 
ing re-enlistees, i.e. persons who rejoined the army after varying 
periods in civilian life. About 50 per cent of the normal subjects 
actually were re-enlistees, and if the assumption is justified that 
persons who rejoin the army in this fashion may be regarded as 
having failed in their adaptation to civilian life, and are therefore 
less stable and less well adapted than the average, then we must 
conclude that our normal group was slightly biased in the neurotic 
direction. Regarding the neurotic group, it was found on analysing 
the data later that the criterion requiring those who took part in 
the experiment to have selection grades of 3 or upwards was not 
fully followed. This led to there being a marked difference between 
controls and neurotics with respect to intelligence, the neurotics 
being very significantly inferior. (The exact data will be given later.) 


¹ These men were sent from the Woolwich Garrison, and our thanks are
due to the Commandant of the Royal Artillery Depot for having given
permission for these men to be tested and for his interest and help in the project.
The main work in connection with the administration of the project was carried
out by Dr. S. Crown. Testing was carried out by Drs. A. Clarke (1950) and
A. Gravely (1950), whose theses contain a detailed account of their work, by
1. Standen, Dr. A. Lubin (1951), and C. Yukviss. To all these the writer is
indebted for their skilled and competent contribution. In the description of the
groups tested and the general set-up of the experiment the account given by
Clarke (1950) has been followed very closely, and often verbatim. The number
of cases used in Clarke’s and Gravely’s theses is smaller than that used in this
chapter as the research went on beyond the point when these two started
working up their results. In consequence some tests attained levels of
significance they fell short of with the smaller number of subjects.

² Our thanks are due to Colonel Pozner for his invaluable help in making
these arrangements and carrying them through, as well as to the other
psychiatrists who took part in the selection procedure.

Another important reservation attaches to the diagnoses,
particularly those of hysteria and anxiety states. In the many doubtful
cases, one important variable factor is the personality bias of the
interviewing psychiatrist. When there is some doubt as to the
actual diagnosis, some psychiatrists will label a patient an hysteric,
who in other hands might be called an anxiety state, and vice
versa. Among the four psychiatrists who made the diagnoses,
there existed certain diagnostic tendencies; thus, A had a strong
tendency to diagnose hysteria, B had a slight tendency to diagnose
hysteria, C thought that a good many of the cases came within the
normal limits of emotional reaction to stress, but had a bias for
anxiety states, and, finally, D when confronted with a doubtful
case would always call it an anxiety state. The actual tabulation
of diagnoses, as made b 


these known biases, and consequently it is very doubtful whether 
the differential diag: 

be taken very serio 
these various group 
himself the variati 
diagnostic groups. 
with less force to t| 


с Versus hysteric + anxiety state, and 
the dichotomy :hysteric versus anxiety state. | 


testing programmes it was impossible to
carry out the testing in such a way that every subject completed
every test. The cause lay partly in the breakdown of apparatus,
the breakdown of the time schedule, late arrival of transport, and
similar causes not dependent on the personality of the subject, and
therefore not likely to have contributed any bias to the testing
procedure. Together with the data relevant to each test are given,
therefore, the number of cases in each of the groups involved. The
first two variables, (1) physical age in years on day of testing, and
(2) service grade on matrix test of intelligence, will show the way
in which the data are set out.


(1) Age | Normals | Anxiety States | Hysterics | Psychopaths | A. & H. | Correlation Ratio | Point Biserial N. v. A. & H. | Point Biserial H. v. A.
No. of Cases | 207 | 47 | 48 | 25 | 95 | .298 | 302 | 95
Mean Score | 25.5 | 23.36 | 21.12 | 21.64 | 22.23 | | .257 | -.212
Variance | 36.01 | 31.37 | 23.00 | 19.07 | 28.12 | | |

(2) Intelligence | Normals | Anxiety States | Hysterics | Psychopaths | A. & H. | Correlation Ratio | Point Biserial N. v. A. & H. | Point Biserial H. v. A.
No. of Cases | 205 | 38 | 40 | 24 | 78 | | 283 | 78
Mean Score | 2.84 | 4.10 | 4.18 | 4.42 | 4.14 | | -.481 | .026
Variance | 0.88 | 1.50 | 2.10 | 2.43 | 1.78 | | |

These figures show that, contrary to our intention, the normal
group is differentiated from the neurotic group by being slightly
older (a point which does not greatly matter as the difference is
extremely slight, although statistically significant), and also a good
deal more intelligent, as indicated by the higher score on the
Matrices test. This failure to equate the two groups for intelligence
has been pointed out before and the way in which it affects the
results will be discussed later. It is possible that the matrix scores
which determine the S.G. grading of the soldiers may not have
been as valid a test as might be supposed, because in another test
of intellectual capacity, the Mill Hill Vocabulary test, the
difference between the normals and the neurotics, while still significant,
was considerably reduced. The figures for test 3, the Mill Hill
Vocabulary, are given below.¹

¹ A comparison of the ratio of verbal to non-verbal intelligence test results
for anxiety states and hysterics confirms our previous finding that the ratio is
low for hysterics (Eysenck, 1947; Himmelweit, 1945).
(3) Vocabulary | Normals | Anxiety States | Hysterics | Psychopaths | A. & H. | Correlation Ratio
No. of Cases | 205 | 45 | 47 | 25 | 92 | .313
Mean Score | 15.56 | 12.99 | 11.12 | 11.94 | 12.03 |
Variance | 24.24 | 39.04 | 35.00 | 46.13 | 37.46 |
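
For readers who wish to see how a point-biserial coefficient of the kind
tabulated for age relates to the group statistics, the following sketch
recomputes it from the numbers of cases, means, and variances alone. It
is an added illustration: the function name is hypothetical, and it is
assumed that the printed variances are computed with n (not n - 1) in
the denominator and that a positive sign means the first-named group
has the higher mean. With the age figures it gives values of about .25
for normals versus anxiety states and hysterics combined, and about
-.21 for hysterics versus anxiety states, close in size to the printed
.257 and .212.

```python
from math import sqrt


def point_biserial(n1, mean1, var1, n0, mean0, var0):
    """Point-biserial r between group membership (first group coded 1)
    and the score, rebuilt from group summary statistics."""
    n = n1 + n0
    p, q = n1 / n, n0 / n
    diff = mean1 - mean0
    # Total variance = weighted within-group variance + between-group part.
    total_var = p * var1 + q * var0 + p * q * diff ** 2
    return diff * sqrt(p * q) / sqrt(total_var)


# Age: normals versus anxiety states and hysterics combined.
r_normal_v_neurotic = point_biserial(207, 25.5, 36.01, 95, 22.23, 28.12)

# Age: hysterics versus anxiety states.
r_hysteric_v_anxiety = point_biserial(48, 21.12, 23.00, 47, 23.36, 31.37)

print(round(r_normal_v_neurotic, 3), round(r_hysteric_v_anxiety, 3))
```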


We are now ready to discuss the differences observed with 
respect to the various tests making up the battery proper. The first 
group of tests to be discussed is made up of personality inventories, 
questionnaires, and other written tests of the type described in the 
previous chapter. They are (4) the Crown Word Connection List: 


(4) Word Connection List | Normals | Anxiety States | Hysterics | Psychopaths | A. & H. | Correlation Ratio
No. of Cases | 189 | 45 | 46 | 25 | 90 | .316
Mean Score | 12.36 | 15.51 | 17.38 | 16.76 | 16.44 |
Variance | 34.97 | 48.94 | 40.97 | 53.69 | 45.33 |


The Maudsley Medical Questionnaire:

(5) M.M.Q. | Normals | Anxiety States | Hysterics | Psychopaths | A. & H. | Correlation Ratio | Point Biserial N. v. A. & H. | Point Biserial H. v. A.
No. of Cases | 195 | 41 | 44 | 25 | 85 | .596 | 280 | 85
Mean Score | 10.30 | 23.24 | 20.44 | 19.04 | 21.79 | | -.596 | -.169
Variance | 42.87 | 52.50 | 83.57 | 67.92 | 69.76 | | |

The Minnesota Multiphasic Personality Inventory, Hysteria 
Scale: 


(6) M.M.P.I. Hysteria Scale | Normals | Anxiety States | Hysterics | Psychopaths | A. & H.
No. of Cases | 200 | 45 | 44 | 25 | 89
Mean Score | 13.42 | 20.42 | 18.86 | 18.32 | 19.65
Variance | 26.06 | 42.75 | 49.38 | 48.89 | 46.12


The Minnesota Multiphasic Personality Inventory; O Score 
(this is the score on those items on the hysteria scale in which 
hysterics are supposed to answer in the same direction as normals): 


(7) M.M.P.I. O Score | Normals | Anxiety States | Hysterics | Psychopaths | A. & H.
No. of Cases | 200 | 45 | 44 | 25 | 89
Mean Score | 6.78 | 4.87 | 5.18 | 5.08 | 5.02
Variance | 12.44 | 12.39 | 11.46 | 11.83 | 11.82

[Photograph captions: 14. Maudsley adaptation of choice-behaviour unit. 16. Dynamometer Persistence test.]


The Minnesota Multiphasic Personality Inventory; Lie Score:

(8) Lie Score | Normals | Anxiety States | Hysterics | Psychopaths | A. & H.
No. of Cases | 200 | 45 | 44 | 25 | 89
Mean Score | 3.06 | 2.80 | 3.39 | 2.56 | 3.09
Variance | 3.96 | 3.48 | 6.24 | 3.26 | 4.88


Annoyances (total number of annoyances picked from a list 
prepared especially for this battery and obtainable from the 
Institute of Psychiatry, Psychology Department): 


(9) Annoyances | Normals | Anxiety States | Hysterics | Psychopaths | A. & H. | Correlation Ratio | Point Biserial N. v. A. & H. | Point Biserial H. v. A.
No. of Cases | 205 | 47 | 47 | 24 | 94 | .161 | 299 | 94
Mean Score | 22.82 | 26.81 | | 24.33 | 26.60 | | -.166 | -.019
Variance | 100.85 | 149.29 | 105.89 | 101.40 | 126.26 | | |


Social Attitudes; Emphasis Score (this is the number of items
on which the individual agrees strongly or disagrees strongly; the
inventory of social attitudes used has been published elsewhere
(Eysenck, 1947a)):

(10) Emphasis | Normals | Anxiety States | Hysterics | Psychopaths | A. & H.
No. of Cases | 189 | 40 | 42 | 19 | 82
Mean Score | 15.39 | 18.28 | 17.71 | 19.95 | 17.99
Variance | 89.40 | 129.02 | 108.84 | 118.05 | 117.30


Worries (this is a Pressey X-O type of test especially made up
for this experiment, copies of which may be had from the
Psychology Department, Institute of Psychiatry):

(11) Worries | Normals | Anxiety States | Hysterics | Psychopaths | A. & H. | Correlation Ratio | Point Biserial N. v. A. & H. | Point Biserial H. v. A.
No. of Cases | 170 | 44 | 43 | 21 | 87 | .272 | 257 | 87
Mean Score | 17.55 | 26.34 | 24.63 | 22.86 | 25.49 | | -.278 | -.059
Variance | 149.23 | 226.37 | 203.38 | 239.83 | 213.25 | | |


Number of Things Disliked (this is a Pressey X-O type of test, 
the score on which is the number of things disliked. It was especially 
made up for this experiment, and copies may be had from the 
Psychology Department, Institute of Psychiatry): 


(12) Dislikes | Normals | Anxiety States | Hysterics | Psychopaths | A. & H. | Correlation Ratio | Point Biserial N. v. A. & H. | Point Biserial H. v. A.
No. of Cases | 177 | 44 | 39 | | 83 | .322 | 260 | 83
Mean Score | 19.11 | 18.73 | 21.72 | | | | -.323 | .116
Variance | 57.53 | 143.97 | 194.73 | 112.79 | 167.99 | | |


(13) Likes | Normals | Anxiety States | Hysterics | Psychopaths | A. & H. | Correlation Ratio
No. of Cases | 184 | 44 | 41 | 25 | 85 | .071
Mean Score | 41.52 | 41.50 | 40.44 | 38.00 | 40.99 |
Variance | 231.23 | 207.98 | 201.95 | 261.33 | 202.92 |


(greater emphasis) from the normal group. The other exception is 
of much greater interest and importance; item 8, the Lie Score, 
shows no differentiation between the normal and neurotic groups.
This finding is of particular importance because the Lie Scale was 
constructed in order to determine whether or not a given subject 
completed the test in such a way as to try and show himself in a 
particularly good light. Such a tendency would be brought out 
very clearly by a high figure on the Lie Score, as has been shown 
experimentally by various investigators. Our finding that normals 
and neurotics do not differ with respect to the Lie Score would 
appear to go counter to claims often made that the normal group 
would be trying to make themselves appear in a good light, 
whereas the neurotic group would not, and that any differences on 
questionnaires such as the Maudsley Medical Questionnaire would 
be due to this tendency. No such interpretation is possible on the 
basis of our data. We may conclude that both neurotics and 
normals tend to give answers to the best of their ability and that 
these answers discriminate at a very high level of significance 
between the two groups. 

It may be of interest to compare the scores of our groups with 
other normal and neurotic groups tested previously to determine 
whether they can be regarded as in any way representative. The 
two tests on which most standardization data are available are the 
Word Connection List and the Maudsley Medical Questionnaire. 
The average score for a normal group on the Word Connection 
List is about 10; for neurotic groups it is in the neighbourhood of 
16. Our data would suggest that while the neurotic group is 
similar to those tested previously, the normal group is somewhat 
less stable than other groups tested, a conclusion well in line with 
the fact that this group included a large number of army re-enlistees, 
as pointed out above. With respect to the Maudsley Medical 
Questionnaire, the scores for the normal and neurotic groups 
respectively are very close to those of previous samples tested at 
various times. The scores given here may be compared with those
quoted in an earlier chapter. On the whole, it seems likely that the
groups tested in this research are fairly typical of normal and
neurotic groups respectively.

The next group of tests to be discussed is made up of objective
behaviour tests. Four of these, numbers 14, 15, 16, and 17, are
manual dexterity tests, M, N, O, and P, taken from the United
States Employment Service General Aptitude Battery. These tests
are illustrated in several of the photographs in this book (numbers
19, 20, 23), and will therefore not be described in detail. The
results obtained from them are given below, as well as a combined
total score, number 18.¹


(14) Dexterity Test M | Normals | Anxiety States | Hysterics | Psychopaths | A. & H. | Correlation Ratio | Point Biserial N. v. A. & H. | Point Biserial H. v. A.
No. of Cases | 168 | 39 | 37 | 23 | 76 | .315 | 244 | 76
Mean Score | 77.29 | 71.59 | 72.16 | 71.39 | 71.87 | | .298 | .033
Variance | 80.72 | 57.51 | 96.97 | 74.34 | 75.77 | | |

(15) Dexterity Test N | Normals | Anxiety States | Hysterics | Psychopaths | A. & H. | Correlation Ratio | Point Biserial N. v. A. & H. | Point Biserial H. v. A.
No. of Cases | 168 | 39 | 37 | 23 | 76 | .335 | 244 | 76
Mean Score | 83.06 | 78.15 | 74.97 | 73.35 | 75.58 | | .301 | -.046
Variance | 101.09 | 162.50 | 178.30 | 135.06 | 168.27 | | |

(16) Dexterity Test O | Normals | Anxiety States | Hysterics | Psychopaths | A. & H. | Correlation Ratio | Point Biserial N. v. A. & H. | Point Biserial H. v. A.
No. of Cases | 168 | 39 | 37 | 23 | 76 | | 244 | 76
Mean Score | 26.22 | 24.18 | | | | | .266 |
Variance | 20.18 | 16.68 | | 11.63 | 17.82 | | |

(17) Dexterity Test P | Normals | Anxiety States | Hysterics | Psychopaths | A. & H. | Correlation Ratio | Point Biserial N. v. A. & H. | Point Biserial H. v. A.
No. of Cases | 168 | 39 | 37 | 23 | 76 | | 244 | 76
Mean Score | 25.56 | 22.77 | 23.03 | 23.00 | 22.90 | | |
Variance | 10.68 | 15.29 | 22.30 | 12.54 | 18.47 | | |

(18) Dexterity Tests, Total | Normals | Anxiety States | Hysterics | Psychopaths | A. & H. | Correlation Ratio | Point Biserial N. v. A. & H. | Point Biserial H. v. A.
No. of Cases | | | | | | | 244 | 76
Mean Score | | | | | | | .375 | -.031
Variance | | | | | | | |


¹ [...] official instructions. Results given should,
therefore, not be used in comparing our groups with others tested on the basis
of the official instructions. The main difference lay in the curtailment of
pre-test trials, and in different wording of instructions.

Test 19 is the Leg Persistence test, scored in terms of the
number of seconds during which the leg is held out over a chair
without touching down (illustrated in photograph 17).


(19) Persistence | Normals | Anxiety States | Hysterics | Psychopaths | A. & H. | Correlation Ratio | Point Biserial N. v. A. & H. | Point Biserial H. v. A.
No. of Cases | 172 | 38 | 37 | 23 | 75 | .148 | 247 | 75
Mean Score | 60.17 | 48.95 | 48.65 | 52.61 | 48.80 | | .149 | -.004
Variance | 1184.20 | 1036.70 | 1612.00 | 983.80 | 1302.60 | | |


Tests 20 and 21 are Speed of Tapping tests, in which the
subject is required to tap on a piece of paper with the tip of his pencil
for 15 seconds.


(20) Tapping I | Normals | Anxiety States | Hysterics | Psychopaths | A. & H. | Correlation Ratio
No. of Cases | 167 | 39 | 37 | 23 | 76 | .055
Mean Score | 69.76 | 70.26 | 71.89 | 67.83 | 71.05 |
Variance | 365.00 | 413.10 | 254.70 | 381.40 | 332.20 |

(21) Tapping II | Normals | Anxiety States | Hysterics | Psychopaths | A. & H. | Correlation Ratio
No. of Cases | 168 | 39 | 37 | 23 | 76 | .071
Mean Score | 72.38 | 73.59 | 75.41 | 70.43 | 74.47 |
Variance | 332.00 | 297.30 | 181.10 | 349.80 | 238.40 |


Items 22 and 23 relate to the amount of scatter produced by
tapping with a pencil on a sheet of paper.

(22) Scatter I | Normals | Anxiety States | Hysterics | Psychopaths | A. & H. | Correlation Ratio | Point Biserial N. v. A. & H. | Point Biserial H. v. A.
No. of Cases | 172 | 39 | 37 | 23 | 76 | | 248 | 76
Mean Score | 58.43 | 93.33 | 67.84 | 49.13 | 80.92 | | -.154 | -.175
Variance | 4073.00 | 6338.60 | 4217.40 | 3490.10 | | | |

(23) Scatter II | Normals | Anxiety States | Hysterics | Psychopaths | A. & H. | Correlation Ratio | Point Biserial N. v. A. & H. | Point Biserial H. v. A.
No. of Cases | 172 | 39 | 37 | 23 | 76 | .235 | 248 | 76
Mean Score | 54.36 | 101.28 | 69.73 | 43.91 | 85.92 | | -.196 | -.187
Variance | 4575.60 | 8558.80 | 5374.90 | 3606.70 | 7168.50 | | |
Item 24 is the average scatter on both trials. Scatter is measured
in terms of the length of the perimeter drawn around the dots in
such a way that only convex and no concave lines are admitted.

(24) Mean Scatter | Normals | Anxiety States | Hysterics | Psychopaths | A. & H. | Correlation Ratio | Point Biserial N. v. A. & H. | Point Biserial H. v. A.
No. of Cases | 172 | 39 | 37 | 23 | 76 | .217 | 248 | 76
Mean Score | 56.74 | 97.69 | 68.92 | 46.96 | 83.68 | | -.178 |
Variance | 4164.80 | 7234.00 | 4593.20 | 3458.50 | 6079.60 | | |


Numbers 25, 26, and 27 are scores derived from a Speed of
Decision test, in which two cards are placed in front of the subject,
face down, and he is asked to say which of the two will be the 
higher. If he is right he obtains one point; if he is wrong he loses 
one point; if he says he feels certain about b 
loses two points respectivel 
and score numb 


ests, and score 27 is the number 
of certainties, (See photograph no. 21.) 


(25) Speed of 


: : Correla- 
cision, | Normals | Ame |н Poe A an Cure 


Ratio |N Yy py | Hove A 


Point Biserials 


No. of Cases | 172 Ge e в 
Меап $соге 348 29 37 3 7 045 


T 3°54 | 3:09 3°50 
Variance 5:59 $04 | 1281 | 2-90 8-71 
(26) Speed of Decision: total time | Normals | Anxiety States | Hysterics | Psychopaths | A. & H.
No. of Cases | 172 | 39 | 37 | 23 | 76
Mean Score | 21.24 | 21.26 | 18.92 | 19.61 | 20.12
Variance | 213.45 | 257.56 | 142.80 | 107.70 | 200.43
Correlation Ratio: .063

(27) Speed of Decision: certainties | Normals | Anxiety States | Hysterics | Psychopaths | A. & H.
No. of Cases | 172 | 39 | 37 | 23 | 76
Mean Score | 8.30 | 7.97 | 7.65 | 6.96 | 7.82
Variance | 6.81 | 6.92 | 7.29 | 11.13 | 7.03
Correlation Ratio: .148


Scores 28 to 31 are derived from the Track Tracer test, in
which the subject is required to trace a path between two rows of
holes with a metal stylus; if he touches one of the holes, a buzzer
rings and an error is counted on an electric counter (illustrated in
photograph 12). This test is given as a Level of Aspiration test, in
the way described in Dimensions of Personality (1947), and the four
scores given are: 28, average goal discrepancy for the time scores;
29, average judgment discrepancy for the time scores; 30,
average error score; and 31, longest time performance score.


Point Biserials 


Алчу | Нушпе | Бу | A. en бт, WM, mp 


DeB бош [uh 
'iscre] H 
Total Score | ^ 


No. of Cases | 172 38 37 22 75 148 
Mean Score 1:23 2:13 2:16 *59 2:15 
Variance 6:02 | 30°66 | 10:31 730 | 20°34 


ore Point Biserials 
29) Judgment Anxiet ES п 
Discrepancy: | Normal: mately | Нуметіс | ` 2: А. ё H.| tion 

Time Score | ^ States | 9 pails Rao |, al Five Ae 


No. of Cases | 172 38 37 22 75 +134 
Mean Score | —:59 +16 "51 | —:23 "33 
Уагіапсе 527 | 26:35 | 1415 714 | 20°09 


(30) Mean Error Score | Normals | Anxiety States | Hysterics | Psychopaths | A. & H.
No. of Cases | 172 | 38 | 37 | 22 | 75
Mean Score | 100.47 | 112.37 | 111.08 | 108.64 | 111.73
Variance | 920.30 | 807.80 | 1276.60 | 840.90 | 1025.30
Correlation Ratio: .167

(31) Longest Time Performance | Normals | Anxiety States | Hysterics | Psychopaths | A. & H.
No. of Cases | 172 | 38 | 37 | 22 | 75
Mean Score | 38.05 | 42.79 | 42.24 | 40.96 | 42.52
Variance | 151.20 | 104.17 | 238.02 | 115.76 | 167.96
Correlation Ratio: .161
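
The correlation ratios quoted in these tables can be checked against the group sizes, means, and variances printed beside them. The sketch below does this for scores 30 and 31, on two assumptions that are not stated in the text: that the ratio is taken over the four groups (normals, anxiety states, hysterics, psychopaths), and that the tabled variances were computed with an n - 1 divisor. The function name is hypothetical.

```python
from math import sqrt


def correlation_ratio(groups):
    """Correlation ratio (eta) from per-group summary statistics.

    groups: iterable of (n, mean, variance) tuples.
    Assumption: each variance uses the n - 1 divisor.
    """
    groups = list(groups)
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * m for n, m, _ in groups) / total_n
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
    ss_within = sum((n - 1) * v for n, _, v in groups)
    return sqrt(ss_between / (ss_between + ss_within))


# (30) Mean Error Score: normals, anxiety states, hysterics, psychopaths.
eta_30 = correlation_ratio(
    [(172, 100.47, 920.30), (38, 112.37, 807.80),
     (37, 111.08, 1276.60), (22, 108.64, 840.90)]
)

# (31) Longest Time Performance, same four groups.
eta_31 = correlation_ratio(
    [(172, 38.05, 151.20), (38, 42.79, 104.17),
     (37, 42.24, 238.02), (22, 40.96, 115.76)]
)
```

Under these assumptions both values agree with the tabled .167 and .161 to within rounding of the printed means and variances.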


Scores 32, 33, 34 and 35 are derived from the Necker Cube
Perceptual Reversal test, given the following four ways. First of
all, the number of reversals during passive contemplation for
30 seconds (score 32); second, the number of reversals while trying
to maximize the number of reversals (score 33); third, the number
of reversals while trying to minimize the number of reversals (score
34); and lastly, the number of reversals while regarding the Cube
passively, without trying to maximize or minimize the score
(score 35).


а Point Biserials 
Cube : . rrela- 

SEH Normals | Алісу | уде, Zë: aseo “ton 

passive 


Ratio Ns H| Hv 4 


No. of Cases | 178 41 44 
Mean Score 616 | 5 


Variance 21:65 | 14:38 929 | 23:19 | 11:72 


(33) Cube "e Point Biserials 
bs Anxiety " Psycho- rrela- 

OS | Normals | Ase | нуш] Рова | 4. op py, | Coa 

maximizing Se ES paths Ratio |N Y | HvA 


No. of Cases 178 41 25 8 о 
Mean Score | 1 1:93 | 10:76 BI З S 


Variance 5133 | 49°34 | 100-22 101:97 | 75:42 


int Biserials 
(34) Cube е ` Point 
Кесене Normals | Anxiety Correla: 


Hyteis | Pocos | д en 
minimizing States | nes раа | A. en oodd veg Ev 
—— —| e — 
e eee r EE 
No. of Cases 178 41 


Mean Score 3°67 
Variance 6:58 à 


44 
*66 3:36 | 3-20 3°51 


1343 | 1038 | 258 | 11-73 
ee Il ll 


(35) Cube Ansty Point Biserials 
E Normals States | Hysterics 


A. н.| Hiv A. 


No. of Cases 178 41 
Mean Score | 6:49 6:39 


Variance 19:83 | 28:54 


Scores 36 and 37 are derived from attempts to measure the
amount of pressure exerted on a pencil while doing the Mirror
Drawing test (score 36), and while doing a test involving the
writing of S and S reversed. The amount o


Ї 


BU? | Nemate | Araia | ser 
S| 
No. of Cases 187 2 2 
Mean Score 737 Ps a 4 
Variance 2: 


62 ER 2:03
Point Biserials


А : Correla- 

P ? i, | Psycho- i L T EE LOIS 
137 Point. Normals | АНУ | Hyseris | буйы | A. 8 H. fion | A o | Sca 
No. of Cases | 187 42 42 23 84 1167 271 84 
Mean Score | 713 740 | 769 | 6-74 | 7°55 —'132 7095 
Variance 1-81 2:80 2:47 2°30 


1:98 
| 


Score 38 is the body sway in a Static Ataxia test scored in 
t” units; score 39 is the body sway in the Suggestibility test 
(illustrated in photograph 8). 


rae Point Biserials 
(38) Static | Normals | Anxiety | Seier Tode (aen E [ncm "T 
| A. € H. 

No. of Cases | 177 42 46 25 88 "259 265 88 
Mean Score | 5:63 | 8:07 | 689 | узо | 746 | == | —-255 | — 138 
Уагіапсе 6-98 | 26-85 | 10-41 | 26-42 | 99 

Point Biserials 

(39) Body Яй „|! Pl Correla- 

m WILLE Md LA aal rr 
No. of Cases 178 42 47 25 89 NLH 267 89 
Mean Score 14°63 | 23:00] 2170| 2268| 2232| | —+159 | —:023 
Variance 375°67 | 839°07 | 750°56 | 845-31 | 783770 


Scores 40 and 41 are in terms of answers to questions put at
the end of the Static Ataxia and Suggestibility tests respectively,
as to whether or not the subject had felt a tendency to sway or fall.
Score = 2 points for a ‘Yes’ answer, 1 point for ‘No’.


Point Biserials 


" P Correla- 
(49) Ataxia, | Normals | ту | нушы] P raven Aen ion a, o 
4. GEH, geg 
No. of Cases 176 42 47 25 89 276 265 89 
€an Score 1:39 1:71 1:66 1:56 1:68 com —+283 | —-058 
апапсе "24 E 223 26 -2 


Measurement of sway in these two tests was carried out by means of
a device described in detail elsewhere (Furneaux, 1951) and illustrated in
photograph 7.
1) Suggesti-
Wi Lëns 
Question 


Normals | Айу | bry series Tide | A. en 


No. of Cases 176 42 47 5 
Mean Score 154 1:74 1:57 1:60 
Variance "25 20 "25 


Tests 42 and 43 were done with the use of the Luria test, score
42 being the number of verbal failures in the Association test,
which constitutes a stimulus for the Luria, and score 43 being the
amount of right-hand motor disturbance recorded on the kymograph
(illustrated in photograph 15).


mew Point Biserials 
Verbal P ^ rrela- 
Hes | Normal ш? | Hysteric "mdr | A ен. Ratio |N; Mia py,| Bä 
=! =) 
No. of Cases 22 І л 18 71 
Mean Score ч 409 Us m HUS :089 
Variance 25:06 | 31-48 3374 | 2285 | 32-48 
i int Biserials 
(43) Right- Жый Сот. |__Peint 
P Psych 4 
029 Моюг | Normals | ED | Spee Par |4. ён, m | на. 
ыыы | Mac р A 
No. of Cases 89 30 31 18 61 *452 150 61 
Mean Score 2516| 46:50 48:84| 42-78 47°69 е ES 1048 
Variance 334°84 | 652-47 560-34 659:24 596-92 Н 
| es | 


Scores 44 and 45 were deri 
tions test, in which the subject 


Point Biserials ` 

en Colour Anxiet 22- | Besch Corda | 

Normals Р | Hysterics | Poder | 4 en) Corel a 

emer States paths Ratio GA Veg gz] BEY A. 
| = 

No. of Cases 172 39 37 23 | 76 +196 

Меап $соге 1:37 ` 


I'I о! 
Variance 1:68 7 2
(45) Letter | y, mals | Anxiety 


Memory States | Posteries H. v. A. 
No. of Cases | 172 39 37 76 
Mean Score | 15:51 | 12772 | 13°16 7052 


Variance. 16-70 | 18-89 | 1914 


Tests 46, 47, and 48 are parts of a Concentration test in which
a long series of numbers or letters is read out to the subject at the
rate of one per second. At irregular intervals the series is suddenly
interrupted and the subject is asked to reproduce the last six
numbers or letters read out to him. Scoring is in terms of the
numbers or letters repeated correctly in the right order. Score 46
is the score on the numbers test, 47 on the letters test, and 48 on
the numbers and letters combined.


Point Biserials 
(46) Concen- P x 72278 Correla- 
raton: | Normals States, | Hpsteries | "ZC |4 en БА LO 
No. of Cases 170 39 37 23 76 +200 246 76 
Mean Score | 29:24 | 26:67 | 26:89 | 26:22 | 26-78 | — +183 ‘O17 
Variance 3434 | 41°49 | 51°54 | 4927 | 4578 
Point Biserials 
(47) Concen- is р Psych Correla- 
ion: ls | Ашау | нуд 90 lA GH| 1 
tation: emat шр | Moers gat BL Noa mua 
No. of Cases | 170 39 37 23 76 246 76 
Mean Score | 25:86 | 22:56 | 23:03 | 21:30 | 22°79 +231 +033 
Variance 30:45 | 50°15 | 49°80 | 58:31 | 49:37 


Point Biserials 


(48) Concen- Я Correla- EC LN 
ion: Anxiet; TTD Psyche а 
tration: Normals | ЖЫР | Hysteries | “ping |4 @ H. E M vp wl нча. 
No. of Cases 1 6 Е 246 76 
; 70 | 39 37 23 7 :259 4 
Mean Score 5511| 4923| 4992| 4752| 4957| зо 1028 


Уагіапсе 103:42 | 159:76 | 14658 | 178-08 | 151-42 
I 


Test 49 is a Dark Vision test. The instrument used was an
A.R.L. Adaptometer Mk. 1a, as produced at the Admiralty
Research Laboratory. The instrument is illustrated in photograph 9.
It consists of a rectangular box with a lamp at one end
and a circular diffusing glass viewing screen at the other. The
brightness of the screen can be varied by the insertion of sixteen
graded apertures at a point about half-way between the lamp and
the screen. The Adaptometer contains a movable filter holder
which enables the user to insert any one of three filters. Thus a
very wide range of illumination is at the experimenter’s disposal,
from 3:961 х 1077 to 7947 х 10-? foot candles. Immediately 
in front of the viewing screen is a metal plate cut into the form 
of a sector which can be rotated into eight separate positions, 
being determined by a clicker mechanism. These positions may 
be described in hours as on a clock, i.e. 12 o'clock, 1.30, 3 o'clock, 
and so on, and the subject is asked to report on the position of the 
sector at each point of illumination of the screen. The apparatus 
battery and a rheostat can be 
mmeter can be made to cor- 
the meter dial. This latter is 


(49) Dark Anxiet 
Vision: Normals | {у 


No. of Cases 141 
Mean Score 13:57 122 
Variance 5:25 


( ype 1200 C). This instrument is 


carried out at the end of the Dark 
ndard ro-minutes as the level of 
Gets Point Biserials
DÉNG: | Normals | EE | series | Pocho- | AL en dor xe "cm 
| а. ан rove 
No. of Cases | 154 36 45 23 81 1197 235 ER 
Mean Score | 2589-0 | 2469-4 | 2531-1 | 2504-3 | 250377 "170 "142 
Variance | 59,810 | 20,470 | 68,100 | 10,440 | 47,360 


Tests 51 and 52 are different ways of scoring the Perseveration 
test. The subject is required to write S's as fast as he can for 15 
seconds, and then S's reversed for 15 seconds. Then he is required 
to write S's and S's reversed alternately for 30 seconds. Score 51 is 
the number of symbols written during the first two periods com- 
bined over the number of symbols written in the third period. 
Score 52 is the total number of symbols written during all three 
periods. 
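
The two Perseveration scores follow directly from the three counts of symbols written. A minimal sketch, with hypothetical argument names and hypothetical counts for one subject:

```python
def perseveration_scores(s_count, reversed_s_count, alternating_count):
    """Return (score 51, score 52) for one subject.

    s_count:            S's written in the first 15 seconds
    reversed_s_count:   reversed S's written in the second 15 seconds
    alternating_count:  symbols written in the 30-second alternating period
    """
    if alternating_count <= 0:
        raise ValueError("the alternating period must contain at least one symbol")
    # Score 51: first two periods combined over the third period.
    ratio = (s_count + reversed_s_count) / alternating_count
    # Score 52: total output over all three periods.
    total = s_count + reversed_s_count + alternating_count
    return ratio, total


# Hypothetical subject: 20 S's, 18 reversed S's, 22 alternating symbols.
score_51, score_52 = perseveration_scores(20, 18, 22)
```

Because the first two periods together last as long as the third, a ratio above 1 means the subject wrote more slowly when alternating than when repeating a single symbol.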


Corre Point Biserials 
(51) Per- " Апхіе) jes | Pod GC 
severation | Normals | "ZC | Hysterics | “ping | A. ЕН. E Mss ay | ae 
— pe 
No. of Cases | 174 40 40 22 8o +032 
Mean Score 167 169 174 1:72 1:71 
ariance "58 +38 28 41 `32 
== sl 
Point Biserials 
(52) Speed of i Р lax 
3 dër? Normals | АЛАНУ | ysterics | Pocho- | 4, e H, tion X 
Symbols iie La Ratio |" A ey p | H.v. A. 
| — m —Àd 
No. of Cases 174 40 40 22 80 254 80 
Mean Score | 61°37! 5570| 5018| 5404| 52:94 252 —-181 
Variance 225:30 | 203°19 | 257:69 | 330-04 | 235°25 
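
The point biserial correlations in these tables can likewise be recovered from the printed group sizes, means, and variances. The sketch below does so for score 52. Two assumptions are made that the text does not state: the tabled variances use an n - 1 divisor, and the sign is taken as positive when the first-named group has the higher mean. The function name is hypothetical.

```python
from math import sqrt


def point_biserial(n1, mean1, var1, n2, mean2, var2):
    """Point biserial correlation between a score and a two-group split.

    Assumptions: variances use the n - 1 divisor; the result is positive
    when group 1 has the higher mean.
    """
    n = n1 + n2
    ss_between = n1 * n2 / n * (mean1 - mean2) ** 2
    ss_within = (n1 - 1) * var1 + (n2 - 1) * var2
    r = sqrt(ss_between / (ss_between + ss_within))
    return r if mean1 >= mean2 else -r


# (52) Speed of writing symbols.
# Normals v. anxiety states plus hysterics (N = 254).
r_normal_v_neurotic = point_biserial(174, 61.37, 225.30, 80, 52.94, 235.25)
# Hysterics v. anxiety states (N = 80).
r_hysteric_v_anxiety = point_biserial(40, 50.18, 257.69, 40, 55.70, 203.19)
```

Under these assumptions the two results agree in magnitude with the tabled values of .252 and .181.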
| 


Score 53 is derived from an experiment on the autokinetic
phenomenon. The score is the number of seconds before the subject
reports movement.


(53) Auto- 
3) Au " Anxiety jes | Ecke: 
Phenomenon | AM | States | “seri | paths | 


УЗ of Cases | 196 | 37 42 | 23 79 | -095 
Vaan Score | 4441| 5746| 4517 48:35] 5092 
апапсс 2436-76 3524:59 11888-58 2429:87 [2657-56 


An examination of these objective behaviour test scores shows
that quite a large number of them differentiate with considerable
efficiency between the various groups. Among the more successful
are those tests which previous work had indicated as being likely
measures of neuroticism, i.e. the Manual Dexterity tests, Tapping
Scatter, Static Ataxia, the Luria test, the Abstractions and
Concentration tests, the Dark Vision, and the S test (score 52), which
may be regarded as a test of personal tempo. There is one
interesting divergence from the usual pattern, namely, the Body Sway
Suggestibility test, which gives a barely significant differentiation
between the normals and the hysteric plus anxiety state group (r = .159).
This is lower than the differentiating capacity of the Static Ataxia
test, which gives a comparable correlation of .255. The lack of
discriminating ability of the Body Sway test is probably due
to the fact that in this case the suggestion record which had
previously been used was replaced by a record made on a wire
recording apparatus by an experimenter whose approach was
soft, ingratiating, and persuasive, as opposed to the more direct, strong,
dominating approach used in the ordinary record. It is also
possible that the electronic method of recording body sway used may
measure something rather different from the usual direct method.
Correlations between the two different methods, taken on a
different sample of 50 subjects, were found to be rather lower than
anticipated.

The next group of tests to be discussed consists of measures of 
expressive movement. These have been described in considerable 
detail in another publication (Eysenck, 1951), and only brief 
descriptions will therefore be given here of each test. These tests 
are of particular interest, in spite of the fact that they universally
fail to discriminate even at the 5 per cent level between the various
groups tested, because, as will be shown later, it is precisely these
tests which discriminate at very high levels of confidence between
normals and psychotics. The importance of this finding, that
those tests which give the best differentiation between psychotics
and normals do not differentiate at all between neurotics and
normals, will be obvious to the reader; it will be referred to again
later in the text.
The first two tests (scores 54-63) are the ‘Three Circles’ and
‘Three Squares’ tests. The subject is handed a clean piece of paper
and asked to draw, first, three circles, and, second, three squares.
No further directions are given and no questions answered.

Linearity. Scoring is as follows:

The three circles are touching or are concentric and linear, i.e.
a straight line can be drawn in such a way as to pass through all
three circles. Score = 3 points.

The three circles are not touching and are not concentric.
Score = 2 points.

The three circles are not linear. Score = 1 point.
The three circles are not scorable. Score = 0 points.


T 


(54) 3 Circles: Normals | Anxiety Bësse Pode A. © H. 


inearity States ths 
No. of Cases 172 39 7 2 76 
Mean Score | 1-63 1:56 1:46 1:56 1:51 
ariance 127 | El -26 -26 Ki 


Time taken to draw the three circles. 


(55) 2 Circles: 


Psycho- 


paths |4- GH. 


Ke  Hysterics 


Correla- | 
tion 


Point Biserials 


Nx. 


Ratio |" Vu p | Hove A 
No. of Cases | 172 37 76 071 
Mean Score | 10:01 9:86 10:41 
Variance 63:72 44°56 73°90 
Smallest diameter of the three circles. 
| Doris Point Biserials 
6 ; п TE —= 
(56) Smallest | Normats | Ж? | нулата | Patton | a, enl gm Ns ae 
| АН Set 
Ne. of Gases | 172 | 39 37 23 76 | om 
сап Score | 31-51 | 33:08| 3162] 2739| 3237 
апапсе 339°20 | 232-40 | 414-00 | 211-10 | 31700 | і 
Largest diameter of the three circles. 
Point Biserials 
(57) Largest pe «s Do Correla- 
Diameter | Normals | АШУ | Hysterics им |4.€ H. E 
N 
р: ОЁ Cases | 172 39 37 23 76 *071 


Variance | 41-51] al 4r89| 3565| Aen 
se | 518-20 | 499-10 | 389-30 | 502-20 
Average diameter of the three circles.


Point Biserials 


бие |... Eun 
8) Average Anxiety is | Pode- | 4 al COE 2 
5) Ачегаве | Normals | H | Hyserics | Poche Fco еа 


No. of Cases | 172 39 37 23 76 "071 
Mean Score | 36:34| 3718| 3676| 31-74 36:97 
Variance 458-40 | 331-30 | 44470 | 242-30 | 381-40 


Linearity score for the three squares. Scoring is as follows: 

The three squares are touching or are concentric and linear, 
i.e. a straight line can be drawn in such a way as to pass through 
all three squares. Score = 3 points. 


The three squares are not touching and are not concentric. 
Score = 2 points. 


The three squares are not linear. Score = 1 point.
The three squares are not scorable. Score = 0 points.


Point Biserials 
(59) 3 Squares: Anri > yi 
ө] Squares: | Narrat Ну | Hystertes Podo- |4, & H. 


No. of Cases 172 39 3 2 6 
Mean Score 1:70 1:62 ids 138 i. 
Variance "24 "24 "25 д7 "25 


Time taken to draw the three squares. 


Ried Point Biserials 
0, 3 E Iquares: 


Ван Normals | Anxiety 


‘State. | Hysterics | Psycho- 


4. @ H.| “tion A. 
paths 4 N. v. H. v. 
К apo алани 


No. of Cases 172 39 37 2 6 7095 
Mean Score 13°43 | 15:97 | 12-65 кн e P 
Variance 106:98 | 158-50 89:51 | 65:99 126-07 


Smallest diameter of the three squares. 


Point Biserials 
(61) Smallest Anxiety Psych Correla- 
Diamet Normals Hysterics Doo | A. en tion A. 
iameter States paths Ret: 24 pM H. Н. у. 


No. of Cases | 172 39 37 2 6 - 

3 7 110 
Mean Score 3733, 36-15 | 38-11 2913 | 3711 
Variance 519'10 | 298-00 526-90 | 181-00 40480 
Largest diameter of the three squares.


pom Pint Biserials’ 
(62) Largest Anxiety | putain | Pocho- a р 
Diameter Normals States Hysteries paths аён. Rat ay es H. H. v. A. 


No. of Cases | 172 39 37 23 76 по 
Меап Score | 46:10] 4923| 5000| 37:39 | 49°61 
Уагіапсе 838-00 | 72310 | 961-10 | 411-10 | 827-80 


Average diameter of the three squares. 


Point Biserials 


(63) Average нна Anxiety 


; Psycho- 
Diameter States | Poseries casa ur. 


M Ratio (ten) Н.У А. 


E, ЧЕРИ 4 Е 
No. of Cases 172 39 37 23 76 лоо 
Меап Score 41:63 | 43°33] 43°24] 33°91 | 43°29 
Variance 635-90 | 417°50 | 694-70 | 243710 | 545°00 | 

| 


It will be noted that the linearity score for the three squares has
a significant F ratio and a corresponding correlation ratio of .195.
Similarly, the linearity score for the three circles is almost
significant. It is interesting to note that these two scores are the only two
in the battery of expressive movements which are insignificant in
discriminating between normals and psychotics.

Size estimation of a pound note (the subject is instructed to 
draw a rectangle the size of a pound note on a clean piece of paper. 
The score is the length of the diagonal. All measures are in milli- 
metres): 


(64) Diameter: Pound note | Normals | Anxiety States | Hysterics | Psychopaths | A. & H.
No. of Cases | 172 | 39 | 37 | 23 | 76
Mean Score | 154.53 | 156.67 | 157.57 | 149.13 | 157.11
Variance | 398.00 | 438.60 | 502.30 | 317.40 | 463.50
Correlation Ratio: .105


Size estimation of a half-crown (the subject is instructed to 
draw the size of a half-crown piece. The score is the length of the 
diameter. All measures are in millimetres): 


(65) Diameter: Half-crown | Normals | Anxiety States | Hysterics | Psychopaths | A. & H.
No. of Cases | 172 | 39 | 37 | 23 | 76
Mean Score | 35.14 | 36.31 | 35.38 | 33.61 | 35.86
Variance | 36.28 | 43.22 | 22.58 | 21.07 | 32.95
Correlation Ratio: .110

Scores 66 and 67 are taken from the ‘Fewness of Lines’ test.
The subject is given a clean sheet of paper with two parallel lines 
drawn on it, five inches apart. He is asked to draw lines across the 
sheet between the two lines given, and the time taken (score 66) 
and the number of lines drawn (score 67) constitute his score. 


d Point Biserials 


(66) Time 


Correla- 
taken 


tion " ! 


| 

Anxiety Psycho- | 
|м Men. 
4 ss 

| 

| 


Ka Hysteries | "ZC А. ен 


Normals 
H. v. А. 


E 
No. of Cases | 171 39 37 2 
3 6 
Мец Score | 37-68) 44-03 30:70 | 38-70 | Sisal 
ariance 1206-03 5730-60 | 431-05 1120:58 3155:34 | 


"084 


T Gert 
бей Point Biserials 
Psycho- ота 
+ | A. €? H. tion Nv. H. v. А. 


(67) Number | „, i 
Of Lines | Normals | Age | pras, 


| Rasio | ДӘН. 
i — > = 
No. of Cases | 172 li | 
39 3 2 { g 6 
Mean Score | 14:06 | 13:38 1992 1036 dai 28 SS | Leg 


Variance | 62-08 d 
| a 30 


| 
3324 | 36:13 | 36-39 | 
= | et 


the ‘Waves’ test, in which 
SE о in the four extreme corners 
» pointing respectively downwards or sideways. The 


Scores reported are the average amplitude and average wave-length
over all four V’s. (To measure the amplitude, draw a line across
the top of the V and connect its mid-point to the bottom of the V.
This last line is the amplitude of the


wave.) Scores were also cal à a 
ately, but as they a calculated for the four sets of V’s separ 


not reported here. 


T - = = 


(68) Average | Point Biserials 


i | 
Amplitudes | Normals | Айу | Guten | Psycho- A e ul ті 
aths Я ^ tion 
D 1 |N. v. H. v. А. 
Ratio y 
A. & H. 
piae 


No. of Cases 172 | 
Mean Score 25:08 239 36 23 75 '095 | 
e 95: 24° p a 
Variance | 25:96 | 3878 | a760 | 2248 3213 | 
(69) Average Wave-length | Normals | Anxiety States | Hysterics | Psychopaths | A. & H.
No. of Cases | 172 | 39 | 36 | 23 | 75
Mean Score | 100.17 | 98.97 | 92.50 | 97.39 | 95.87
Variance | 509.30 | 709.40 | 562.10 | 492.90 | 640.80
Correlation Ratio: .110


The last five scores are derived from the ‘Measuring Distances’
test. The subject is asked to indicate what he considers a 2 ft.
distance to be by placing two matches 2 ft. apart on the table.
Next, he is asked to indicate the distance of 1 ft., then one of 8 in.
The distances are recorded only after he has finished the whole
test. Scores 70, 71, and 72 are in terms of the distance between the
two matches in ¼-inch units.


( Correla: Ee 
70) Distance: Anxiely PES Psycho- D neo 
if er | Normals | шр | ушпа | Бшш А. ӘН.) fion X on vd 


No. of Cases 159 41 47 
Mean Score | 93:69 | 9220| 94:68 | 93°70] 95:52 
Variance 17019 | 264:96 | 226-14 | 205-49 | 242:94 


| 
(71) Distance: Anxiety Psycho- Correla- 


inte Normals | “Stas | Hysterics paths A. GH Fd Ar wy "m" 


No. of Cases | 159 4t 47 23 88 "071 
Mean Score | 48:37 | 47:10 | 49'00 | 48:26 | 48-11 
Variance 56:17 | 5764 | Grat | 6084 | 6015 


m "mo Point Biserials 

"ide | Normals |  щ | Мше | шу AOH] Жи Куз wa. 
— be 

No. of Cases 159 "m 47 23 88 *100 

Mean Score | 31-33 | 29:93 | 3138 | 30°96 | 3070 

ariance 29:44 | 23:27 | 3207 | 2977 | 2819 


Scores 73 and 74 are the total overestimate and the total
underestimate, respectively, i.e. a summation of all positive or
negative errors respectively in ¼-inch units for the whole group
concerned.

(73) Total Overestimate | Normals | Anxiety States | Hysterics | Psychopaths | A. & H.
No. of Cases | 159 | 41 | 47 | 23 | 88
Mean Score | 8.95 | 8.27 | 11.23 | 9.61 | 9.85
Variance | 134.88 | 167.20 | 308.23 | 252.52 | 242.06
Correlation Ratio: .071

Point Biserials 

(74) Total ] Anxiety x, | Piia Corel Lt 

Underestimate | Normals) "Seel | Номае) Pocho | 4. en Ed N Yy y| Hive As 
= 

No. of Cases | 159 41 47 23 88 077 

Mean Score | 11-58 15°05] 11° 


à 4| 12:70| 13-28 
Variance 261:30 | 328-35 165-76 | 204-68 “16 
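
The overestimate and underestimate scores can be illustrated for a single subject as follows. This is a sketch under assumptions, not the scoring procedure actually used: it takes the unit to be a quarter of an inch (which matches the mean settings of roughly 96, 48, and 32 units for 2 ft., 1 ft., and 8 in.), it reports the underestimate as a positive magnitude, and all names and sample settings are hypothetical.

```python
# Target distances in quarter-inch units (assumed unit): 2 ft., 1 ft., 8 in.
TARGETS = {"2 ft": 96, "1 ft": 48, "8 in": 32}


def over_and_under(settings):
    """Return (total overestimate, total underestimate) for one subject.

    settings: mapping from target label to the distance actually set,
    in the same quarter-inch units as TARGETS.
    """
    errors = [settings[label] - target for label, target in TARGETS.items()]
    overestimate = sum(e for e in errors if e > 0)
    underestimate = sum(-e for e in errors if e < 0)
    return overestimate, underestimate


# Hypothetical subject: sets 90, 50, and 30 units for the three targets.
over, under = over_and_under({"2 ft": 90, "1 ft": 50, "8 in": 30})
```

Keeping the two totals separate, rather than summing signed errors, prevents an overestimate on one distance from cancelling an underestimate on another.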


Another group of tests, namely, the so-called projective
techniques, is represented in this study only by the Rorschach. Scores
75 and 76 are derived from a group Rorschach test given according
to the group method developed by Harrower-Erikson. Score 75
is a neuroticism score based on seven of the Miale and
Harrower-Erikson nine signs; score 76 is based on 15 of Davidson’s seventeen
signs. Absence of a given neurotic sign was scored + 1, so that the
score is one of normality.


т int Bisertals 
(75) Rorschach: ie 4 Point 
g Ё 

айт кыр e Lu 496] ter. 
No. of Cases 157 

35 42 21 a 77 
Меап Score 2:92 2:31 2:38 1-76 at :243 GE 1529 
Variance 2:37 35 


20 | 205 | 209 2:05 


UM cR E ee БИРЕ _.. 
(76) Rorschach: Point Biscrials 


avidson | Normals | Anxiety | oy. | Русо Correla- 
Score States: (ni) “paths | 4 en Raio |N Y y| HvA 
baten we TE 
No. of Cases | 1 | 
Mean Score p EA Gas = 77 2405 234 WU. 
Variance 466 514 | 594 da 


997 | 373 | 2:93 3:80 
Sn H— M Eh Sc ү үш 


Of the measures reviewe 
the purpose of a factorial a 


identify the direction of scoring for each 
variables; these correlations are given in Table XXV. In view
of the fact that not every subject had completed every test, it was
impossible to use the total number of subjects for this analysis.
Ninety-six subjects had done all except three of our twenty-eight
tests. Thirty-eight subjects had done all tests, including these three
(dark-vision, flicker threshold, and the Luria tests). Consequently
the great majority of the correlations in the table are derived
from N = 96; all correlations involving the above-mentioned three
tests are derived from N = 38. As pointed out above, there is no
reason to expect that failure of a subject to carry out one of the
tests was in any way correlated with that subject’s personality or
motivation. All the subjects used for this analysis were of course
normals; no neurotic soldier was included in the correlations.
We may now proceed to analyse this matrix. Five factors
altogether were taken out before the residuals could be regarded as
definitely insignificant. These centroid factors were then subjected
to criterion analysis. We may regard the ‘Hysteria plus Anxiety’
group as our nuclear neurotic group, and in Table XXVI,
column 2, are given the biserial correlations of each of the 28 tests
with the normal-neurotic dichotomy, defining neurotic in terms of
the combination of hysterics and anxiety states.¹ In order to make
all these values positive, the signs of 14 tests were changed, as
indicated in the column headed ‘Multiplier’. It is with these
changed signs that we will be dealing throughout the discussion
that follows.
Column 3 gives the saturations of each test with a factor
obtained by finding that vector which maximized correlation
between columns 2 and 3. This factor, therefore, is by definition
identified as a factor of ‘neuroticism’. The correlation between
columns 2 and 3 is a good deal lower than in several other analyses
reported in this book; where usually the correlation between factor
and criterion column is between .6 and .8, it is only .400 in the
present case. We do not have far to seek for a solution. As pointed
out before, the normal soldiers were considerably more intelligent
than the neurotic soldiers through an error in selection. But
intelligence has not hitherto been found to have any close relation with
neuroticism. It follows that the intelligence test (test 2) has a high
correlation with the normal-neurotic dichotomy (r = .481), but a
completely negligible factor saturation (r = .109). Unfortunately


¹ Each of the correlations is based on the largest possible number of subjects
who had completed each test.

it is not only the intelligence test which is affected in this way, but
every test which is correlated with intelligence. Thus the low 
criterion correlation appears to be due largely to an irrelevant 
sampling error. Fortunately, this error does not affect the main 
part of our argument, which is concerned not with the establish- 
ment of the fact that a factor of neuroticism exists, but rather with 
the question of whether the diagnosis of ‘psychopathy’ forms part 
of this factor. 
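
The idea of choosing the factor direction that agrees best with a criterion column can be illustrated in miniature. The sketch below is a deliberate simplification and not the computation reported here: it rotates only two factors instead of five, searches angles on a grid rather than solving for the best direction analytically, and uses made-up loadings and criterion values. All names are hypothetical.

```python
from math import cos, pi, sin, sqrt


def pearson(x, y):
    """Product-moment correlation; returns 0.0 if either column is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def best_rotation(factor_1, factor_2, criterion, steps=3600):
    """Grid-search the rotation angle whose rotated loadings correlate
    most highly with the criterion column.

    Returns (correlation, angle in radians, rotated loadings).
    """
    best = (-2.0, 0.0, [])
    for k in range(steps):
        theta = 2 * pi * k / steps
        rotated = [a * cos(theta) + b * sin(theta)
                   for a, b in zip(factor_1, factor_2)]
        r = pearson(rotated, criterion)
        if r > best[0]:
            best = (r, theta, rotated)
    return best


# Made-up loadings of five tests on two centroid factors, and a made-up
# column of criterion correlations for the same five tests.
loadings_1 = [0.6, 0.5, 0.4, 0.3, 0.2]
loadings_2 = [0.1, -0.2, 0.3, -0.4, 0.5]
criterion_column = [0.45, 0.30, 0.40, 0.10, 0.35]
r_max, angle, saturations = best_rotation(loadings_1, loadings_2, criterion_column)
```

In this miniature, `saturations` plays the part of column 3 and `r_max` the part of the factor-criterion correlation; comparing `r_max` across two alternative criterion columns mirrors the comparison between .400 and the value obtained when psychopaths are added to the criterion group.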

Accordingly we have calculated a new criterion column 
(column 8), made up by correlating each test with a new criterion 
consisting of the old neurotic group (hysterics and anxiety states) 
and the psychopaths weighted equally. Column 9, then, is that factor 
which maximizes the correlation with column 8. If this correlation 
is equal to or larger than that found before (r = .400), then it
would follow that psychopathy forms part of our best possible
criterion. In actual fact the correlation drops to r = .320, thus
indicating that psychopathy is not part of the nuclear concept of
‘neuroticism’, and hence not part of our best possible criterion.
This conclusion is not influenced by the unfortunate selection error 
which introduced differences in intelligence between normal and 
neurotic groups, as both the hysteric-anxiety and the psychopathic 
groups are equally affected by this error. In the absence of repeti- 
tion and confirmation of this finding it would be unwise to make 
too definite a statement, but the data strongly support the con- 
clusion that psychopathy is likely to have projections on the 
neuroticism axis, but unlikely to lie on it. We cannot, on the 
basis of these data, say whether the proper position for psycho- 
pathy is in the space generated by those axes already located 
(psychoticism, extraversion-introversion), or whether a new dimen- 
sion will be required to accommodate this particular set of person- 
ality traits. In the absence of factual information, it would be idle 
to speculate. 

We may now turn to certain observations suggested by the 
results in Tables XXV and XXVI. We have only attempted to 
interpret the first factor extracted from our matrix, i.e. the ‘neuroti- 
cism’ factor: the question arises as to the nature of the other factors. 
As the experiment was set up with a definite problem in mind, and 
did not therefore include controls which would facilitate such inter- 
pretation of factors other than the first one, it is with considerable 
hesitation that we suggest the possible meaning of factors two and 
three, i.e. columns 4 and 5. The second factor appears to divide
fairly clearly the verbal tests from the non-verbal, in a manner 
similar to that usually found in the analysis of batteries of intel- 
ligence tests. On the one side we have the flicker fusion, manual 
dexterity, finger dexterity, body sway, leg persistence, tapping, 
abstractions, and concentration tests; on the other the vocabulary, 
word connection list, Maudsley medical questionnaire, worries, 
likes, interests, and annoyances tests, as well as the Luria verbal 
failures and the Minnesota o scale. There are few exceptions to 
this rule, and those that stand out are perhaps not inexplicable. 
Thus the Luria test appears with the verbal tests, which is perhaps 
intelligible in terms of the exclusively verbal stimulation provided, 
and the verbal response required. 
The emergence of this factor lends some support to an heuristic 
hypothesis formulated elsewhere in connection with a discussion of 
recent advances in personality testing (Eysenck, 1950). ‘It is widely
agreed that personality rests on a firm hereditary basis, but is also
subject to great alterations through social and other environmental
influences. It would appear, by and large, that personality tests of
the objective performance type are related rather more closely to
the inherited pattern of a person’s conative and affective traits;
tests of conditioning, of suggestibility, of autonomic imbalance, of
sensory dysfunctioning and of motor expression appear so closely


bound up with the structural properties of the nervous system [...]

‘On the other hand [...] and verbal reactions [...] questionnaire
type, would appear to reflect more the historical aspects of a
person’s life story and be subject to day-to-day fluctuations of mood
and outlook. If we may [...] might say that tests of this type [...]
rather than with those of struc[...] cognitive field also
environmental [...] connection with verbal than [...] Further proof
for the heuristic [...] been mapped out so well already by factorial
analyses properly planned for the purpose, there is little point in
amplifying this suggestion. Nor will we attempt to interpret the
other two factors extracted.

Interesting points are suggested by a comparison of columns 2 
and 3. Apart from the main hypothesis regarding the causes of 
proportionality of these columns, there are certain subsidiary 
assumptions non-fulfilment of which may cause the correlation to 
drop quite considerably. One of these assumptions is that of linear 
regression of each test on neuroticism. Figure 23 will illustrate this 
point. Let the abscissa be the normal-neurotic continuum, divided 
at point Y into two parts, viz. our normal and neurotic groups 
respectively. Let A, B, and C be the regression lines of three tests of
neuroticism. It will be seen that test C will give a good discrimina-
tion between normals and neurotics, but show very small correla-
tion with other neuroticism tests in the normal group. Test B,
conversely, will show good intercorrelations within the normal
group, but will not give such good differentiation between normals
and neurotics. Only test A, having a linear regression line, will show
strict proportionality between discrimination and intercorrelation.

[Figure 23: regression lines A, B, and C of three tests; ordinate, score on test; abscissa, neuroticism continuum from normal to neurotic]
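
The argument about the three regression lines can be illustrated by a small simulation. The sketch below is not part of the original analysis: the three curve shapes, the uniform spread of subjects along the continuum, the cut at Y = 0, and the noise level are all assumptions chosen only to mimic the lines described for Figure 23. For each hypothetical test it reports the correlation with the normal-neurotic dichotomy (discrimination) and the correlation with a linear reference test inside the normal group (intercorrelation).

```python
import math
import random


def pearson(x, y):
    """Product-moment correlation between two equally long sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate(n_per_group=5000, noise_sd=0.5, seed=0):
    """Return {test: (discrimination, within-normal intercorrelation)}."""
    rng = random.Random(seed)
    # Assumed latent positions: normals below the cut Y = 0, neurotics above it.
    normals = [rng.uniform(-2.0, 0.0) for _ in range(n_per_group)]
    neurotics = [rng.uniform(0.0, 2.0) for _ in range(n_per_group)]
    latent = normals + neurotics
    group = [0] * n_per_group + [1] * n_per_group

    # Assumed regression shapes:
    #   A rises linearly over the whole continuum,
    #   B rises in the normal range and is flat in the neurotic range,
    #   C is flat in the normal range and rises in the neurotic range.
    shapes = {
        "A": lambda t: t,
        "B": lambda t: min(t, 0.0),
        "C": lambda t: max(t, 0.0),
    }

    # A linear reference test standing in for "other neuroticism tests".
    reference = [t + rng.gauss(0.0, noise_sd) for t in latent]

    results = {}
    for name, curve in shapes.items():
        scores = [curve(t) + rng.gauss(0.0, noise_sd) for t in latent]
        discrimination = pearson(scores, group)
        within_normal = pearson(scores[:n_per_group], reference[:n_per_group])
        results[name] = (discrimination, within_normal)
    return results


if __name__ == "__main__":
    for test, (disc, inter) in simulate().items():
        print(test, round(disc, 2), round(inter, 2))
```

Under these assumptions test C discriminates between the groups while correlating hardly at all within the normal group, and test B correlates as well as test A within the normal group while discriminating less well, which is the pattern the text describes.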

The figures in Table XXVI suggest that questionnaires of
various kinds follow the regression line exemplified by test C; it
will be seen that for both the Maudsley and Minnesota question-
naires the figures in column 2 are considerably higher than those
in column 3. Tests of the Pressey X-O type and the ‘Annoyances’
kind, however, appear to follow the regression line of test B; the
figures in column 3 are considerably higher than those in column 2.
The word connection list, the Concentration tests, the Luria [...]
disturbance, and the Leg Persistence test are examples of tests
apparently following regression line A. It is difficult to generalize
from these data, but it does appear reasonable to advance an
heuristic hypothesis to the effect that verbal tests (both of the
direct questionnaire and the indirect cross-out type) have non-
linear regression on neuroticism, while objective behaviour tests
tend to have linear regression lines. Further research is urgently
needed to provide more direct evidence on these points.

We must now face the question of the validity of any
batteries of tests which might be constructed from those included
in our battery. The most obvious method of ascertaining this value
would perhaps be that of calculating a multiple correlation
between all our tests and the neuroticism factor, which would thus
constitute our criterion. This procedure is obviously fallacious;
while it gives us the satisfyingly high R of almost .90, we must
regard this value with suspicion until the whole battery has been
validated on another sample of normal and neurotic soldiers.
Instead, the following method was used in order to minimize [...]
[...] tests 4s 5 14, 16, 19, 39, 42, 47, and 9, 11, 10) [...]
[...]ed to calculate R for these two sets [...] it was thought that
in thus making a choice of the tests to be used in advance, those
chance factors which tend to inflate multiple R would have much
less scope to assert their influence.

[...] test 9 was dropped from Set B because the correlation
between this test and the factor was much higher than was usual in
[...] because we wanted to avoid inflating [...] inclusion of a
correlation patently too high [...] two tests from the originally
designated batteries may be criticized, it may be noted that the
effect on the multiple R’s is very small, and on the whole tends to
lower rather than to raise them.

The values of R for these two batteries of seven tests, calculated
according to the Lubin-Summerfield (1951) method, are .755 and
.728 respectively; for the combined batteries R = .817. These
values may still be overestimates, but it seems unlikely that the
true validity of either battery would be much below .70, or that of
the combined battery below .80. As a check on this conclusion,
another method of calculation was tried. Eighteen tests had been
selected before completion of the calculations, and on the basis of
previous experimentation, as most likely to correlate significantly
with the neuroticism factor. Two of these had to be dropped for
reasons already given (Body Sway suggestibility and Annoyances),
so that we are left with sixteen tests: 45, 14, 15, 16, 17, 19, 24, 4,
5, 6, 38, 11, 52, 49, 42, 43. A multiple R was then calculated, giving
equal weights to all these tests, and making no use whatsoever of
the statistical findings of the present analysis. In this way, both
selection of tests and weighting system are unaffected by obtained
results, and the resulting R should give an estimate of the lowest
possible validity coefficient for our battery of tests. The value of R
turned out to be .77, a value which strongly supports our previous
conclusion. Any reasonable system of weighting based on previous
work, and not using the figures derived in the present research,
would raise this figure to .80 or over. It seems reasonable, then, to
conclude that the battery of tests here assembled has a validity of
approximately .80 or thereabouts: it will be shown later that a
similar battery has a reliability of at least .85, and probably above
.90. These figures strongly support the argument advanced in
Dimensions of Personality that in neuroticism we are dealing with
a personality factor which can be measured as reliably and as
validly as intelligence. In further support of this contention, a brief
description will next be given of two experiments applying similar 
methods to those described to children. One of these experiments 
deals with objective and verbal tests, the other with the Rorschach. 
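
The equal-weights check described above rests on a standard identity: the correlation between a criterion and an unweighted sum of standardized tests is the sum of the test-criterion correlations divided by the square root of the sum of all entries in the test intercorrelation matrix. The following is a minimal sketch of that identity only; the function name and the numbers in the example are hypothetical and are not taken from Tables XXV or XXVI.

```python
import math


def unit_weight_validity(validities, intercorrelations):
    """Correlation of an equally weighted sum of standardized tests with a criterion.

    validities[i]           correlation of test i with the criterion
    intercorrelations[i][j] correlation of test i with test j (1.0 on the diagonal)
    """
    k = len(validities)
    # Variance of the sum of k standardized tests = sum of every matrix entry.
    composite_variance = sum(
        intercorrelations[i][j] for i in range(k) for j in range(k)
    )
    return sum(validities) / math.sqrt(composite_variance)


# Hypothetical three-test battery (illustrative values only).
example_validities = [0.50, 0.40, 0.60]
example_matrix = [
    [1.00, 0.30, 0.20],
    [0.30, 1.00, 0.25],
    [0.20, 0.25, 1.00],
]
example_r = unit_weight_validity(example_validities, example_matrix)
```

Because the weights are fixed in advance, nothing in this coefficient is fitted to the sample, which is why the text treats it as a conservative estimate.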

Hitherto we have been dealing exclusively with adults, because
most of the work done on emotional stability and neuroticism has
been done with subjects over eighteen years old. If our general
hypothesis of neuroticism as a kind of constitutional weakness be
correct, however, it would appear essential that experiments with
children, using analogous tests to those described above, should be
shown to give similar results. From the practical point of view, too,
the measurement of liability to breakdown in children is probably 
far more important than such measurement in the case of adults; 
preventive work is more likely to be successful at a relatively early 
age, and the possibility of ‘screening’ children who need psychiatric 
attention is a very inviting one. Two comparable studies have been 
reported from our laboratory, one using objective behaviour tests 
and pencil and paper tests, the other using projective material. 


In the first of these studies (Himmelweit & Petrie, 1951), 50
normal and 50 neurotic children were matched on sex, age, and
I.Q., and tested for five hours by means of individual tests and for
three hours by means of group tests. Selection of normal children
was carried out by asking head-[...] for testing who appeared
relative[...] no obvious signs of emotional [...]wardness. Selection
of neurotic [...] choosing consecutive admission[...] Clinic. These
‘neurotic’ children [...] psychiatrist on a five-point sc[...] were
rated well-adjusted, and seven as only mildly maladjusted. It will
be seen that the neuroti[...]

Fourteen tests showing the best discrimination were selected,
and a discriminant function analysis performed. The method used 
was that originated by Penrose (1947) which gives a close approxi- 
mation to the linear discriminant function when all intercorrela- 
tions are approximately equal. This method of discrimination 
requires the computation of a ‘size’ and a ‘shape’ score for each 
person. These scores are respectively the sum of the individual 
standard scores of each test, and the sum of individual scores suit- 
ably weighted, the weight being proportional for each test to the 
observed difference of the two group means for that test. The shape 
and size scores tend to be uncorrelated, and the linear function 
that best differentiates between the groups (i.e. reduces the number 
of misclassifications to a minimum) is then determined by calcu- 
lating the multiple regression of the normal-neurotic dichotomy 
on the two scores. 
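
The ‘size’ and ‘shape’ scores can be sketched as follows. This is only an illustration of the description given above, not Penrose's published procedure in full: standardizing each test over the pooled sample is an assumption, the function names are hypothetical, the scores are invented, and the final step (the regression of the normal-neurotic dichotomy on the two scores) is left out.

```python
import math


def standardize(column):
    """Convert the raw scores on one test to standard scores (mean 0, SD 1)."""
    m = sum(column) / len(column)
    sd = math.sqrt(sum((x - m) ** 2 for x in column) / len(column))
    return [(x - m) / sd for x in column]


def size_and_shape(scores, is_neurotic):
    """Return (size scores, shape scores, weights) for every child.

    scores       one row per child, one column per test
    is_neurotic  one boolean per child (True = neurotic group)
    """
    n_children = len(scores)
    n_tests = len(scores[0])
    # Standard scores, test by test, over the pooled sample (an assumption).
    z = [standardize([row[j] for row in scores]) for j in range(n_tests)]

    # Weight for each test: difference between the two group means on that test.
    weights = []
    for col in z:
        neurotic = [v for v, flag in zip(col, is_neurotic) if flag]
        normal = [v for v, flag in zip(col, is_neurotic) if not flag]
        weights.append(sum(neurotic) / len(neurotic) - sum(normal) / len(normal))

    # Size: plain sum of standard scores. Shape: weighted sum.
    size = [sum(col[i] for col in z) for i in range(n_children)]
    shape = [sum(w * col[i] for w, col in zip(weights, z)) for i in range(n_children)]
    return size, shape, weights


# Invented scores for six children on three tests (illustration only).
scores = [[1, 5, 3], [2, 4, 2], [1, 6, 3], [4, 5, 1], [5, 6, 2], [4, 4, 1]]
is_neurotic = [False, False, False, True, True, True]
size, shape, weights = size_and_shape(scores, is_neurotic)
```

With weights defined in this way the neurotic group necessarily has the higher mean ‘shape’ score in the sample that supplied the weights, which is one reason the text goes on to check the scores against a rating that played no part in fixing them.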

Figure 24 shows the scatter diagram produced by plotting each 
child's score against the ‘size’ and ‘shape’ co-ordinates; each dot 
represents a neurotic child, each cross a normal child, while the 
diagonal line drawn through the cluster of points is the linear 
discriminant. It will be seen that nine maladjusted and ten well-
adjusted children are misclassified according to the criterion, giving
a total misclassification of 20 per cent. This should be compared
with a chance level of misclassification of 50 per cent. The quad-
ratic discriminant was calculated, avoiding the assumption that
the variance-covariance matrices within each group were the same,
but did not aid significantly in correct classification.

The multiple correlation coefficient was found to be +.70,
a value which presumably capitalizes on chance errors to some
extent, and is subject to shrinkage in future applications of the tests.
However, it was found that ‘neuroticism’ scores calculated on this
basis correlated significantly with the psychiatrist’s ratings for the
neurotic children, thus showing validity when applied to a sample
which had not determined the actual weights used.

The second of these studies is very similar in design and treat-
ment to the first one (Cox, 1951). Subjects were divided into an
experimental group of sixty neurotic boys attending the Maudsley
Child Guidance Clinic, twenty of whom were aged from 8 years to
9 years 11 months, twenty from 10 years to 11 years 11 months,
and twenty from 12 years to 13 years 11 months, and a control
group of ‘normal’ boys matched for age and intelligence. An indi-
vidual Rorschach test was given to these children, as well as an
individual intelligence test of the non-verbal type (Progressive
Matrices). In scoring the Rorschach test, Beck’s (1944) scoring
categories were used, with minor additions from Klopfer and Kelly
(1942).

Twenty-five indices were selected from the much larger num-
ber scored, and intercorrelated. Correlations were also formed
between these variables and the intelligence test result (I.Q. in the
table), and between the variables and the normal-neurotic dicho-
tomy (the criterion). Five factors were extracted from the resulting
matrix, which is reported in full elsewhere (Cox, 1951). The
saturations for these factors of the twenty-seven items (twenty-five
scoring categories, I.Q., and criterion) are given in Table XXVII.
Also given are the correlations between the various items and the
normal-neurotic criterion (cf. column ‘D’). The first value in the
‘D’ column, which has been put in brackets, is the square root of
the communality for the criterion, and is an estimate of the degree
to which the total battery has succeeded in measuring the trait
underlying the dichotomy, i.e. neuroticism.

TABLE XXVII

Item               |    I |   II |  III |   IV |    V |    D
 1. Criterion      |  265 |  734 |  208 | -196 | -121 | (83)
 2. [illegible]    | 84[?]| -171 | -441 | -220 | -139 |  -12
 3. Ad.            |  360 |  503 | -305 | -184 |  226 |   43
 4. Hds.           |  403 |  562 | -182 |  127 |  218 |   49
 5. Fire           |  392 | -378 |  189 |  089 | -178 |  -08
 6. Geol. & mtns.  |  474 | -188 |  342 |  397 |  084 |   03
 7. Water          |  441 | -505 |  174 | -056 | -037 |  -10
 8. Arch.          |  477 |  072 |  598 | -110 | -084 | 2[?]
 9. Mech. Sci.     |  362 |  398 |  417 |  084 | -273 |   50
10. Misc.          |  140 | -327 |  252 |  075 |  267 |  -47
11. Ws             |  188 |  154 |  478 |  142 | -332 |   25
12. D              |  825 |  153 | -183 | -216 |  139 |   48
13. Dd.            |  602 |  257 | -278 |  257 | -050 |   25
14. [illegible]    |  942 |  154 | -268 |  095 | -089 |   34
15. F+'s           |  841 |  261 | -094 |  058 | -305 |   40
16. [illegible]    |  661 | -069 | -264 |  337 |  382 |  -03
17. [illegible]    |  731 |  435 | -289 |  142 | -157 |   40
18. [illegible]    |  501 |  193 | -330 |  197 |  508 |   18
19. [illegible]    |  426 | -128 | -129 |  047 |  105 |   00
20. [illegible]    |  [?] |  [?] |  [?] |  [?] |  [?] |  [?]
21. [illegible]    | -560 | 05[?]| 19[?]|  [?] |  [?] |  [?]
22. [illegible]    |  575 |-47[?]|  276 | -363 |  189 | 0[?]
23. [illegible]    |  030 |  254 |  512 |  257 | -084 |   32
24. [illegible]    |  897 | -063 | -354 |  149 | -177 |   21
25. [...] time     | -448 |  461 |  167 |  250 |  365 |  -08
26. [illegible]    | -451 |  420 |  161 |  337 |  489 |  -03
27. Failure        | -811 | -297 | -036 |  516 | -246 |   58

Decimal point properly preceding each entry omitted.
The interpretation of these factors is relatively straightforward.
The first factor appears to be one of productivity or number of
responses, having its highest loadings on ‘total number of responses’
([...]), and total number of responses involving the use of form (.942),
and its highest negative loading on ‘failure to respond’ (-.811).
The second factor appears to be the neuroticism factor one
would expect to find very prominent in such a sample of children;
it has its highest loading in the normal-neurotic criterion (+.734)
and is distinctly proportional in its loadings to the D column. The
third factor appears as one of intelligence, with high loadings
on I.Q. and cultured and unusual responses like architecture,
mechanical and scientific, and geology.

The fourth factor depends for its interpretation so much on 
Rorschach mythology that only the most tentative explanation 
seems in order. It appears to contrast the individual who meets 
the environment either with a quick, often poor response, or
violently, by emotional outbursts and point-blank refusal to co-
operate, with the individual who experiences the environment
without attempting to control it, being too absorbed in his personal
difficulties. ‘This interpretation is based on the implications in
Rorschach terms of the Geology and Mountain response, which is
suggestive of superficial evasiveness, common-place stereotypy
and the pure Colour responses indicating crude emotionality. These
characteristics may be conceived as opposed to the feelings of
inferiority, anxiety, depression, and more generally apathy and
passivity implied by the chiaroscuro and vista responses.’ The
similarity of this ‘dynamic’ interpretation to the introvert-extravert 
dichotomy is obvious, but in view of the hazards of such inter-
pretation no stress is laid on this factor. No interpretation is
attempted of factor V, which may be entirely a statistical artefact.

Having thus shown the presence of a strong factor of neuroticism
in the test data, a discriminant function analysis was carried out for
the maximal differentiation of th[...] used will be found elsewhere
([...] in Figure 25, in a form com[...] the objective behaviour [...]
[...]on between factor scores [...] value of +.594 when all five
factors are used, and .566 when only the first two factors are used.
(This difference is not statistically significant.) Again, it is probable
that these values overestimate the size of the correlation one might
[...] experiment, and that the values [...] However, it seems
unlikely that [...] of error; indeed, the high
saturations which the normal-neurotic dichotomy rating received
in the factor analysis, i.e. in a type of analysis in which no such
maximization of error takes place, suggests that our figures err if
anything in underestimating the possible usefulness of this test for
the detection of potential neurotics. It should also be borne in
mind, of course, that the criterion itself was far from perfect and
that higher validity coefficients could be expected if some way
could be found to purify the criterion.

This analysis raises important theoretical points. Projective
experts and Rorschach analysts will undoubtedly protest that by
thus treating the Rorschach as an ordinary psychometric device,
violence is done to the underlying principles of this technique.
They will point out that when a record is being analysed, the
meaning of a given score changes subtly in accordance with the
context in which this score is found, so that we must never treat a
given response ‘analytically’, i.e. in isolation, but must see it as
part of a gestalt. Failure to do so robs the Rorschach test of its
most distinctive features, and falsifies the picture which may be
obtained from it.

The atomist’s reply would of course be along familiar lines. He 
would point out that it is impossible statistically to get better 
prediction than that given by the multiple correlation formula, 
and that any departure from the weights given to any particular 
score by the regression formula according to the dictates of ‘insight’ 
or regard for the total gestalt can only lead to a lowering of pre- 
dictive accuracy. The large-scale experiment on selection of clinical 
psychologists at Michigan provides empirical evidence on this 
point; it will be described in some detail elsewhere in this book.

In the present case, a very simple experiment suggests itself to 
test the validity of the theoretical position taken by the gestaltist. 
Given the criterion of allocating a group of children to either the
normal or the neurotic side, we have already shown with what
success this can be accomplished through the use of factorial
analysis. Now if the gestaltist claim is justified, we would expect
expert judgment based on the total record, and subject to no
restrictions whatever, to be superior to the predictive accuracy
achieved by the formal statistical analysis. Here indeed we would
seem to have a crucial experiment on this very important theor-
etical point, and consequently two experienced Rorschach workers
were asked to sort the hundred and twenty records into those they
regarded as ‘definitely neurotic’, ‘probably neurotic’, ‘probably
normal’, and ‘definitely normal’. (For various reasons, the total
number of records in the results given below is not always 120, and 
consequently a statement of the actual number involved in each 
case is given. One of the two Rorschach experts was given a chance 
to look at a sample of records for the sake of getting familiar with 
the particular local characteristics of the children; these are of 
course not included in the experiment.) 


Below are given the results for expert M (Table XXVIIA). It
will be noted that out of 116 children, 58.6 per cent are correctly
identified, a result which is significant at the 5 per cent level when
the one-tail test is used (t = 1.764). ‘Definitely’ neurotic or normal
identification is no better than ‘probably’ neurotic or normal
identification. The correlation between prediction and actual status
is .17. These results are hardly encouraging.


TABLE XXVIIA

Classification        |  Actual Classification  |  Correct
by Expert M           |  Neurotic  |  Normal    |     %
Definitely neurotic   |     15     |    10      |    60
Probably neurotic     |     19     |    14      |    58
Probably normal       |     15     |    26      |    63
Definitely normal     |      9     |     8      |    47
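
The summary figures quoted for expert M can be recomputed from the cell counts of Table XXVIIA. The sketch below collapses ‘definitely’ and ‘probably’ into a single call, then computes the percentage correct, the fourfold (phi) correlation, and a normal-approximation test against chance. The continuity correction is an assumption on our part: the source does not say how its test statistic was obtained, although this choice does reproduce the quoted 1.764.

```python
import math

# Cell counts from Table XXVIIA: expert's call -> (actually neurotic, actually normal)
table = {
    "definitely neurotic": (15, 10),
    "probably neurotic": (19, 14),
    "probably normal": (15, 26),
    "definitely normal": (9, 8),
}


def summarize(counts):
    """Return (n, per cent correct, phi coefficient, continuity-corrected z)."""
    # Collapse to a 2 x 2 table: called neurotic / called normal.
    a = counts["definitely neurotic"][0] + counts["probably neurotic"][0]  # hits
    b = counts["definitely neurotic"][1] + counts["probably neurotic"][1]  # normals called neurotic
    c = counts["definitely normal"][0] + counts["probably normal"][0]      # neurotics called normal
    d = counts["definitely normal"][1] + counts["probably normal"][1]      # correct normals
    n = a + b + c + d
    correct = a + d
    pct_correct = 100.0 * correct / n
    phi = (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    # Test of 'correct' against the chance expectation n/2 (assumed method).
    z = (abs(correct - n / 2) - 0.5) / math.sqrt(n * 0.25)
    return n, pct_correct, phi, z


n, pct_correct, phi, z = summarize(table)
print(n, round(pct_correct, 1), round(phi, 2), round(z, 3))  # 116 58.6 0.17 1.764
```

The agreement with the text (116 children, 58.6 per cent, a correlation of .17) suggests that the table has been reconstructed correctly.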


Results from the other Rorschach expert show that of 96 chil- 
dren, 23 were correctly classified as normal and 23 as neurotic: 
25 normal children were misclassified as neurotic, and 25 neurotic 
children misclassified as normal. The result, it appears, is slightly
though not significantly worse than might be expected by chance.
(It should be noted that when this expert sorted protocols without
having to pay attention to the condition that 50 per cent of the
children were normal and 50 per cent neurotic, he obtained very
much better results, chi squared being 10.7. It is difficult to account
for this difference.)

No finality is claimed for this comparison between the atomistic
and the gestaltist methods of approach, nor do we claim that the
overwhelming superiority of the atomistic method would neces-
sarily be found in future work. This particular comparison was
introduced as an afterthought, and did not form part of the original
design of the experiment; consequently certain controls are lacking
which would be essential if the results were to be regarded as in
any sense definitive. It is included here mainly for the purpose of
pointing out that theoretical points of view, such as those of the
atomist and the gestaltist, can be settled on the empirical level,
and are capable of being submitted to controlled experimentation;
twenty years of verbal argument have not produced the consensus
of opinion in this field which ought to follow a series of well-
[...] studies. It is, therefore, the method rather than the result
in this research on which we wish to lay stress. Those who believe
in the gestaltist approach, and want to remain in the realm of
science, must sooner or later submit their hypotheses to a [...]
experimental test of the kind here outlined; semantic argument
along well-worn lines is not likely to make their point of view
acceptable to experimentalists who look rather for empirical
demonstrations.
While these results and considerations make it appear feasible 
to consider techniques such as the Rorschach in terms of orthodox 
test construction, application, and validation, it should be stressed 
that in many respects these techniques fall behind legitimate 
requirements. Let us consider only one of these requirements, 
namely that of reliability. The literature here is in such a confused 
state that it is almost impossible to derive any agreed conclusions. 
Thus in one breath Rorschach experts claim that such devices as 
the split-half technique for measuring Rorschach reliability are 
inapplicable because the test must be considered as a whole, while 
in the next breath they quote with approval such applications 
of the split-half technique as have given acceptable values for 
Rorschach reliability. If the original objection to the use of the 
split-half method has any force, then surely the demonstration 
that nevertheless high reliabilities can be obtained must be taken 
to invalidate the hypothesis which has led to the original objec- 
tion. This is but another example of the ad hoc use to which theories 
are being put in current clinical work, so that even when deduc-
tions made from these theories are shown to be falsified, this does
not lead to any modification of the original theory.


However, if we take theoretical arguments against the use of 
the split-half method seriously, then we must fall back on the 
‘equivalent tests’ procedure for measuring reliability. Fortunately
there is a parallel form of the Rorschach test, constructed in such 
a way as to be strictly comparable to it, namely, the so-called 
Behn-Rorschach (Zulliger, 1941) or Bero test. The Bero and the 
Rorschach tests were given, in properly counterbalanced order, to 
one hundred normal and ninety-six abnormal subjects, in an 
attempt to obtain evidence on the reliability of the Rorschach 
scoring categories (Meadows, 1951). This procedure would seem
to be unexceptionable from the point of view of Rorschach theory, 
provided the two versions are indeed strictly comparable. 

The crucial test for the assumption of comparability consists in 
a comparison of the means and sigmas of the scoring categories of 
the two tests. If these are not significantly different, then clearly 
the two tests measure exactly the same personality areas; if they 
are significantly different, then it is impossible to maintain the
hypothesis of comparability. Table XXVIII gives means and
sigmas for eight scoring categories (R, W, D, M, F, C, A, and P),
for the normal, abnormal, and total groups respectively. Com-
parison shows the close similarity of the figures for the two tests.

TABLE XXVIII

MEAN RORSCHACH SCORES AND SIGMAS, AS COMPARED WITH MEAN
BEHN-RORSCHACH SCORES AND SIGMAS, FOR NORMAL, ABNORMAL, AND
TOTAL GROUPS

Scoring   | Group | Mean   | Mean   | Sigma  | Sigma
Category  |       | (R)    | (B)    | (R)    | (B)
R         |   N   | 24.40  | 25.62  | 12.17  | 11.25
          |   A   | 19.00  | 19.83  | 10.26  | 10.16
          |   T   | 21.80  | 22.76  | 11.59  | 11.10
W         |   N   |  8.02  |  8.24  |  3.36  |  3.58
          |   A   |  6.31  |  5.60  |  3.24  |  2.74
          |   T   |  7.20  |  6.94  |  3.41  |  3.46
D         |   N   | 15.05  | 16.06  | 10.07  |  9.26
          |   A   | 11.50  | 13.10  |  8.65  |  8.47
          |   T   | 13.40  | 14.55  |  9.61  |  9.01 [?]
M         |   N   |  2.08  |  1.82  |  1.44  |  1.68
          |   A   |   .99  |  1.12  |  1.21  |  [?]
          |   T   |  1.55  |  1.47  |  1.44  |  1.53
F         |   N   | 10.24  | 10.94  |  8.28  |  6.53
          |   A   |  8.25  |  9.15  |  6.31  |  6.04
          |   T   |  9.80  | 10.01  |  7.53  |  6.36
C         |   N   |  1.56  |  1.23  |  1.44  |  1.31
          |   A   |   .60  |   .63  |  1.02  |  1.07
          |   T   |  1.09  |   .93  |  1.34  |  1.24
A         |   N   |  7.89  | 10.08  |  4.15  |  4.68
          |   A   |  6.75  |  8.22  |  3.94  |  4.34
          |   T   |  7.33  |  9.15  |  4.09  |  4.61
P         |   N   |  4.48  |  4.23  |  1.62  |  1.69
          |   A   |  3.38  |  3.37  | [?]67  | [?]97
          |   T   |  3.99  |  3.80  |  1.65  |  1.66

(R) = Rorschach; (B) = Behn-Rorschach; N = normal, A = abnormal, T = total group.

In addition to the 8 scoring categories quoted, another 27 were
in actual fact compared, for normals and abnormals separately,
making a total of 70 comparisons. Out of these 70 comparisons of
values for the means, three were significantly different at or be-
yond the 5 per cent level. This is roughly what would be expected
by chance (it is impossible to give an exact value to the chance
hypothesis as the scoring categories are of course not independent);
consequently there is little reason to assume that the two tests are
not comparable.
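
The remark that three significant differences out of 70 is ‘roughly what would be expected by chance’ can be made concrete. The sketch below gives the expected number of nominally significant results at the 5 per cent level and, purely as a rough guide, the binomial probability of three or more. The binomial figure assumes independent comparisons, which the text explicitly says the scoring categories are not, so it should be read as an approximation only; the function names are hypothetical.

```python
import math


def expected_chance_hits(n_comparisons, alpha):
    """Expected number of comparisons reaching significance by chance alone."""
    return n_comparisons * alpha


def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p); assumes the n comparisons are independent."""
    return 1.0 - sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))


expected = expected_chance_hits(70, 0.05)        # about 3.5 comparisons
p_three_or_more = prob_at_least(3, 70, 0.05)     # rough guide only
```

An expectation of about 3.5 chance differences against 3 observed is the arithmetic behind the conclusion that the two forms may be treated as comparable.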


TABLE XXIX

RELIABILITY COEFFICIENTS (RORSCHACH v. BEHN-RORSCHACH) FOR
NORMAL AND ABNORMAL GROUPS

Category     |    N    |    A    | Category      |    N    |    A
R (f)        |  .83    |  .86    | FM (f)        |  .36    |  .54
F (f)        |  .78    |  [?]    | Art (c)       |  .35    |  .51
Hd (c)       |  [?]    |  .41    | cl (f)        |  .34    |  .52
D (f)        |  .74    |  .81    | m (f)         |  .33    |  .52
Ad (c)       |  .70 [?]|  .54    | AAt (c)       |  [?]    |  .62
S (f)        |  .69    |  .40    | [?] (f)       |  .30    |  .58
A (c)        |  .63    |  .80 [?]| FK (f)        |  .30    |  .35
FC (f)       |  [?]    |  .25    | Dd (f)        |  .28    |  .62
H (c)        |  .57    |  .57    | A Obj. (c)    |  .26    |  .57
W (f)        |  .53    |  .60    | Pl. (c)       |  .26    |  .43
M (f)        |  .52    |  .60    | Blood (c)     |  .20    |  [?]
Dis (f) [?]  |  .80 [?]|  .45    | F- (f)        |  .00 [?]|  .59
Obj (c)      |  .50    |  [?]    | Sm. Cl. (c)   |  .48 [?]|  .24
K (f)        |  .47    |  .58    | K (f) [?]     |  [?]    |  .33
Geog. (c)    |  .45    |  .52    | c (f)         |  .13    |  [?]
[?]          |  .39    |  .65    | Fire (c) [?]  |  [?]    |  [?]
P (f)        |  .36 [?]|  .54    | [?]           |  [?]    |  [?]

f = formal score
c = content score


[...] correlations between the two tests, for
all 35 scoring categories, and [...]
(customarily regarded as an acceptable level of reliability), and
only the R category, i.e. the total number of responses, gives a
correlation of over .80 for both normals and abnormals. One
further category (F) gives a correlation of over .70 for both normal
and abnormal groups, and one category (A) gives a correlation of
over .60. Five correlations (Ad., H, W, M, and Obj.) are over .50
for both normals and abnormals. A considerable number of cor-
relations is in the low twenties or below, and some are actually
negative! These results should be borne in mind by those who base
their interpretations on minute differences in Rorschach scores;
the possibility that these differences are merely fortuitous, and of
no fundamental importance, cannot be disregarded.

1 It will be noted that the order of size for the reliabilities is very similar for
the normal and abnormal groups, but that the reliabilities for the abnormal
group are fairly consistently larger than those for the normal group.

Indeed, if we were to advance an hypothesis to account for the 
superiority in the experiment described on page 163 of the statistical 
over the intuitive approach, it would be in terms of the tendency
of the expert to pay attention to, and base his interpretation on,
minute idiosyncrasies, odd phrasings, and slight deviations from
the normal which have little reliability, and consequently little
validity, while giving less weight to the more reliable and valid
scoring categories which are picked out by the factorial-statistical
approach. The same hypothesis has been advanced by Kelly and
Fiske (1950) to account for a similar failing of the projective-
intuitive approach in the large-scale Michigan experiment into
the selection of Clinical Psychology students. ‘The essence of
clinical evaluation and integration of data involve permitting the
clinician to assign to each item of opinion “beta weights”, which
vary from case to case according to the clinician’s perceived pat-
terning of the data. Our findings suggest that this technique may
result in increasing the ratio of error variance to true variance
with successive ratings based on increments of information. This
may lead to a subjective feeling of increased knowledge about the
assessee without a parallel awareness of the fact that many of the
additional items of information are not actually correlated with
the criteria, and hence should not be weighted in arriving at a
prediction about the assessee.’

This chapter has shown us that at least one dimension in the 
non-cognitive field can be measured with sufficient accuracy to
make it possible to proceed to more formal experiments on the
causation and the correlates of ‘neuroticism’; we must now turn to
a detailed consideration of the available evidence in these fields.


Chapter Five 


HEREDITY AND ENVIRONMENT 


It is commonly believed that heredity plays a considerable part
in the determination of an individual’s personality. If we accept
the well-known definition of personality as ‘the integrated
organization of all the cognitive, affective, conative, and physical
characteristics of an individual as it manifests itself in focal distinctness
to others’, we might expect that much research endeavour would
have been dedicated to the discovery of hereditary influences on
the cognitive, affective, conative and physical characteristics of
the individual. A certain amount of such research there has been,
but its emphasis has been curiously lopsided; we have some studies
into inheritance of physical characteristics, and some into
inheritance of cognitive abilities, but there has been little worth-while
research into the conative and affective sides of personality.

The most favoured method of investigation has been the so-called
‘twin method’ developed in Germany (Siemens, 1924),
which consists in comparing the average resemblance of identical
twins with that of fraternal twins. The difference between identical
twins, due to environment alone, is compared with the difference
between fraternal twins, due to both heredity and environment; if
differences between fraternal twins are much greater than between
identical twins, heredity appears to be a powerful causative factor,
while if differences are small or non-existent, the influence of
heredity as a causative factor in individual differences is discounted.
It is possible to give mathematical expression to the estimated
contribution of heredity and environment to the variance of any
given test, as well as of the interaction of heredity and environment
(Shuttleworth, 1935), provided we are ready to make the assumption
that the environment is as similar for a pair of fraternals as for
a pair of identicals.

The large amount of research done along these lines into the
inheritance of intelligence has been summarized adequately by
Verschuer (1939), Schwesinger (1933), Newman, Freeman, and
Holzinger (1937), Gottschaldt (1939), and Woodworth (1941).
The fairly universal conclusion has been that ‘inter-family environmental
differences account for a much smaller proportion of the
variance in intelligence than do hereditary differences’ (Shuttle- 
worth, 1935). No equally comprehensive generalization has 
hitherto been made in the affective and conative fields, if we except 
the rather negative conclusions arrived at by Newman, Freeman, 
and Holzinger (1937) who say: ‘In most of the traits measured the 
identical twins are much more alike than the fraternal twins, as 
indicated by higher correlations. This is true of physical dimen- 
sions, of intelligence, and of educational achievement. The only 
group of traits in which identical twins are not much more alike consists of
those commonly classed under the heading of personality.¹ . . . The difference
in resemblance of the two classes of twins is not the same in
the different groups of traits. In general, the contrast is greater in
physical traits, next in tests of general ability (intelligence), less in 
achievement tests, and least in tests of personality or temperament. 
In certain instances, viz. tapping, will-temperament, and neurotic 
disposition, the correlations of identical twins are but little higher 
than those of fraternal twins.’ 

A brief review of such experimental and observational data as 
are available will indicate some of the reasons for this failure to 
achieve positive results comparable to those achieved in the cogni- 
tive field, and will also make us familiar with certain dangers in 
twin research which have invalidated many conclusions confidently 
drawn from methodologically inadequate data. 

Much the most extensive work on the inheritance of personality
has been done in the field of mental illness, where twin studies
of psychotic (and occasionally neurotic) patients have been widely
accepted as a method for elucidating the influence of heredity on
pathology. The work of Rudin (1930), Rosanoff et al. (1935, 1941),
Essen-Moeller (1941), Luxenberger (1928, 1930, 1933, 1934, 1935,
1940), Kallmann (1941, 1946), and Kallmann & Barrera (1942),
leaves little doubt that concordance in identical twins is considerably
more frequent than in fraternal twins, although still far from
perfect. This type of work inevitably suffers from the subjectivity
of psychiatric diagnoses, and the lack of reliability associated with
all rating methods.

A quite different field, which also has attracted some attention,
is that of criminal tendency. Lange (1931), Stumpfl (1936), Kranz
(1936), and Borgstroem (1939), have shown that identical twins


1 Italics not in original. 
are more concordant in their criminality than are fraternal twins,
even when complete separation has taken place. Popenoe
concludes from a review of this material that we must ‘ascribe to
heredity a more important role in the production of crime than has
hitherto been the case’.
When we come to the more definitely experimental type of
study, we find that the relevance of the experiment to the problem
of heredity of personality is usually contingent on the theory of
personality held by the investigator. Conditioning experiments
appear important to the followers of Pavlov; eidetic imagery to 
those who agree with Jaensch's system; handwriting is studied by 
the ‘expressive movement’ school; autonomic patterns are investi- 
gated by the physiological experimenters; perseveration and 
fluency by the Spearman school; . . . employed by the followers of
Rorschach and Freud.

. . . (1940a and . . . handwriting . . . (19350) and Miguel (1935)
came to the opposite conclusion after experi . . . of twins each.
Writing speed was found to be . . . by heredity in Bracken’ . . .
author (Bracken, 1940a)


. . . (1939) conducted matching experiments which showed hereditary
factors to be prominent in handwriting, while Hartge (1936) found
handwriting to be of no diagnostic value in individual diagnosis of
monozygoticity. Nicolay (1939) and Hermann (1939) agree in
finding little hereditary determination in handwriting, with the
exception of the writing angle. Saudek & Seeman (1932, 1933)
emphasize both heredity and environment in the determination of
writing and drawing. Newman, Freeman, & Holzinger (1937)
support Galton’s original finding, that there is surprisingly little
resemblance between handwritings of identical twins. The one
exception to this appears to be the quality of the handwriting
(cf. also Kramer & Lauterbach, 1928).
Closely related to handwriting studies are three investigations
of the Downey Will-Temperament test, which is based essentially
on handwriting characteristics. Tarcsay (1939) reports negative
conclusions, while Bakwin (1930) is more positive. Newman, Freeman,
& Holzinger (1937) report findings which definitely disprove
the hypothesis that heredity determines individual differences in
reaction to this test; the intraclass correlations on the four scales of
this test used by them are higher for fraternal than for identical
twins! (The actual values are .69, .36, .53, and .51 as against .45,
.31, .51, and .48.)
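
The comparison of the two sets of intraclass correlations can be made concrete with a small sketch. The text does not say which index of hereditary determination is intended at this point; as an assumption, the sketch uses Holzinger's index, h² = (r_MZ - r_DZ) / (1 - r_DZ), one common way of turning twin correlations into a single figure. The function name is illustrative only; the eight correlations are those quoted above.

```python
def holzinger_h2(r_identical: float, r_fraternal: float) -> float:
    """Holzinger's index: (r_MZ - r_DZ) / (1 - r_DZ).

    Assumption: this index is used here for illustration; the passage
    itself does not name the formula it has in mind.
    """
    if r_fraternal >= 1.0:
        raise ValueError("fraternal correlation must be below 1")
    return (r_identical - r_fraternal) / (1.0 - r_fraternal)


# Intraclass correlations on the four Downey scales, as quoted in the text.
fraternal = [0.69, 0.36, 0.53, 0.51]
identical = [0.45, 0.31, 0.51, 0.48]

indices = [holzinger_h2(mz, dz) for mz, dz in zip(identical, fraternal)]

for scale, h2 in enumerate(indices, start=1):
    print(f"scale {scale}: h2 = {h2:+.3f}")
```

On every one of the four scales the index comes out negative, which is simply another way of stating that the fraternal pairs resemble each other more than the identical pairs do on this test.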

Perceptual factors in the personality of twins have been studied 
by Bracken (1939c), Hofstetter (1948) and Smith (1949). Eidetic 
imagery, which according to Jaensch's theories is closely connected 
with personality type, was shown by the last-named author to be 
strongly determined by hereditary factors, as was also reaction
time. The other authors show that in the production of after- 
images, in the extent of visual illusions, and in accommodative 
convergence heredity plays a powerful part. 

A beginning has been made in the study of conditioning in 
twins by Kanaev (1934, 1938, 1941), who used the Krasnogorsky 
modification of Pavlov's method to show remarkable similarities 
between identical twins. He links this approach to Pavlov's theories 
of personality, which are based on conditioning experiments, and 
it is unfortunate that his suggestive work has been left in a relatively
undeveloped state where no certain conclusions can be drawn. 

Another physiological variable which has been studied on the
hypothesis that it might be found to be correlated with personality
differences is the brain-wave pattern. Lennox et al. (1945), Elmgren
(1941), Davis & Davis (1936), and Gottlober (1938) have
shown remarkable similarities between identical twins, and less
marked similarity between fraternal twins; in the absence of any
satisfactory theory linking brain-wave patterns to personality, however,
it is difficult to interpret these findings.

Of more definite relevance to personality are three studies of
yet a third physiological variable, viz. the psycho-galvanic reflex.
The important monograph by Wenger (1948) has definitely established
the close connection between neuroticism and autonomic
imbalance, and has shown that the P.G.R. is a good measure of
autonomic imbalance. Carmena (1934, 1935a), working on 60
pairs of twins, showed that the P.G.R. is strongly influenced by
heredity. Jost & Sontag (1944) supported this finding by using, in
addition to the P.G.R., pulse pressure, salivation, heart period,
respiration rate, vasomotor persistence time, and other autonomic
measures. They conclude that an autonomic constitution may be
at least partially inherited.
Ratings along the lines of the Vineland Social Maturity Scale
have been carried out by Bracken (1939b), Troup & Lester (1942),
and Wilson (1941); there is considerable agreement that whatever
is measured by this scale is influenced to a considerable extent
by heredity. While it is believed that the Social Maturity Scale
measures personality factors additional to intelligence, it has not
hitherto been possible to identify them, and therefore it is difficult
to interpret the findings.
Motor skill has been studied by Brody (1937) and by McNemar 
(1933). Both authors find identical twins much more alike than 
fraternal twins, and conclude that the hereditary hypothesis is the 
most plausible explanation of individual differences in motor skills. 
This finding is relevant to personality research because it has been 
shown by Eysenck (1947) that motor skills correlate quite highly 
with neuroticism, so that hereditary determination of individual
differences in skill gives presumptive evidence for hereditary
determination of individual differences in neuroticism.

Related to this work are the studies by Becker & Lenz (1939)
and Pauli (1941) showing that differences in work curves are more
pronounced in fraternal than in identical twins. Irregularities in
work curves have also been shown to be diagnostic of neuroticism
(Eysenck, 1947), so that this finding too supports the tentative
hypothesis that neuroticism may be based on a hereditary foundation.
Perseveration has been studied by Yule (1935) and by Cattell
& Molteno (1940). The former, using 115 twins, showed that on a
battery of p tests of the Stephenson type heredity played an important
part; the latter, using 84 pairs of twins, found that p tests
gave no evidence of hereditary influence. Tests of F (fluency of
association) were also given by Cattell & Molteno, who concluded
that family-environmental differences ‘are about 8 times as important
as hereditary segregation of genes in accounting for individual
differences in fluency’. This conclusion links up with the work of
Carter (1938, 1939) and Sorensen & Carter (1940) on association
in twins; it was found that identical twins are slightly more alike
with respect to speed of association.

Most of the studies mentioned so far have had only tangential
relevance to personality; more directly relevant might be work
on questionnaires and projective tests. Carter (1933, 1935) has
reported on the use of the Bernreuter Personality Inventory as
applied to 133 pairs of twins. Identical twins were markedly more
similar than fraternal twins with respect to neurotic tendency,
self-sufficiency and dominance. This conclusion does not agree with
the results published by Newman, Freeman & Holzinger (1937)
with respect to another neuroticism questionnaire. The Rorschach
technique has been used by Troup (1938), Eckle & Ostermeyer
(1939), Marinescu et al. (1934), and Kerr (1936). Results are conflicting,
Kerr’s being essentially negative, Marinescu’s positive,
while the other reports are somewhat intermediate. No clear picture
emerges from the combined results, however.

Certain techniques have been used by one investigator only, 
and while the connection between the test used and personality is 
not always clear, the results illustrate a point in methodology which 
we want to stress in the discussion. Szondi (1939) found greater 
similarity in identical twins when he applied the test that bears 
his name to 97 pairs of twins. Petó (1946) used psychoanalysis as
his method of investigation, finding surprising identity of symptom
in two identical twins. Malan (1940) found spatial orientation to
be an inherited trait. Hunt & Clarke (1937) showed marked differences
in the startle pattern of a pair of twins. Carter (1932) found
occupational interests to be due in part at least to hereditary
causes. Frischeisen-Köhler (1933) reports that personal tempo is
definitely conditioned by heredity. Thompson (1943) showed determination
of play-behaviour by heredity. Waardenburg (1929)
showed greater similarity between identical twins with respect to
likes and dislikes. Zilian (1938) found less variability for identical
twins on imaginal and motor factors. Steif (1939) found great
similarity in scribbling between identical, little similarity between
fraternal twins, a result similar to Luchsinger's findings with respect
to voice range (1940).

Certain obvious characteristics emerge even from this very brief
review of the literature dealing with twin studies in personality
research. (1) Objectively oriented investigations are mostly very
limited in scope, dealing with traits of a low order of complexity
such as scribbling, angle of handwriting, eidetic imagery, reaction
times, or spatial orientation. (2) When an attempt is made to study
higher-order concepts, such as criminality or psychosis, the concepts
chosen are of a sociological-ethical, or psychiatric rather
than of a psychological nature, and the investigation proceeds
along lines far removed from the objectivity of psychometric testing
procedures. (3) While on the whole most authors agree that
identical twins are more alike than fraternal twins on most of the
tests used so far, there are many inexplicable contradictions (e.g.
the studies of perseveration, of questionnaire tests, and of the
Rorschach). (4) Most personality tests have such low reliabilities
that results are almost bound to be disappointing; correction for
attenuation is seldom attempted. (5) Even when results are clear-cut,
they are often difficult to interpret due to our lack of knowledge
of precisely what it is a given test is measuring.
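
Point (4) can be illustrated with Spearman's classical correction for attenuation, which divides an observed correlation by the square root of the product of the two reliabilities. The sketch below is only an illustration: the observed correlation of .40 and the reliability of .60 are hypothetical figures chosen for the example, not values taken from any of the studies reviewed.

```python
import math


def correct_for_attenuation(r_observed: float, rel_x: float, rel_y: float) -> float:
    """Spearman's correction: r_observed / sqrt(rel_x * rel_y)."""
    if not (0.0 < rel_x <= 1.0 and 0.0 < rel_y <= 1.0):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_observed / math.sqrt(rel_x * rel_y)


# Hypothetical figures: a twin correlation of .40 on a test whose
# reliability is .60. Both twins take the same test, so the same
# reliability enters the formula twice.
observed = 0.40
reliability = 0.60
corrected = correct_for_attenuation(observed, reliability, reliability)
print(f"observed {observed:.2f} -> corrected {corrected:.2f}")
```

With a reliability as low as .60 the corrected resemblance is half as large again as the observed one, which shows how far an unreliable test can understate the similarity of twins.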

Some of the conflicting results reported may be due to technical
faults which vitiate many of the studies reported. Results are often
stated as impressions, rather than being reported as objectively
scored and statistically validated conclusions. Choice of twin pairs
to be tested is often based on faulty sampling practices, fraternal
twins which look unlike each other being overlooked in favour of
those who resemble each other. (Correct procedures are described
by Verschuer (1939) and Rosanoff et al. (1937).) Diagnosis of
monozygoticity or dizygoticity has often been faulty, even where
the procedures adopted have been described in full. These and
other technical faults are easily overcome by experimenters of
reasonable competence. Two other criticisms are more fundamental,
and must be discussed in some detail.


(1) In passing, we have noted that the whole procedure of twin
research rests on the assumption that the environment is as similar
for a pair of fraternal twins as it is for a pair of identical twins.
Stocks (1930), Holmes (1930), Bracken (1933, 1934a, 1934b, 1935,
. . .), and Jones & Wilson (1933) present reports
indicating that identicals are treated more alike than are fraternals,
a fact which would appear to invalidate . . . kercken (1935),
Lohmeyer (1935), . . . (1941), on the other hand, . . . mann (1935)
and . . . identical twins to . . . 1941) comments that . . . observed
would probably . . . abilities and personality . . . on (1934) has
emphasized, while in . . . identical pairs live under more similar
conditions . . . ‘this fact must be attributed ultimately to the . . .
heredity which led, or forced them to “select”
more similar environments’. While it is impossible to be dogmatic
on this point, it does appear that the argument against similarity
of environment for the two types of twins is speculative and hypothetical;
there is no evidence to suggest that such differences as
may exist are not themselves due to hereditarily determined selection
of environments, or that the small differences observed could
account for the large differences in test results. Until more empirical
evidence is produced by the critics we cannot concede that
their arguments do much damage to twin research methodology.

(2) The following criticism of twin work has not to our knowledge
been made previously, although a solution to the problem
posed by it has been published elsewhere (Eysenck, 1950). It is
relevant not only to studies of temperamental traits, but equally so
to work on intelligence and would seem to undermine the elaborate
structure of argument built up on twin research. Essentially, this
criticism concerns a conceptual jump which takes place when an
argument is presented regarding the inheritance of intelligence from
the intercorrelations of identical and fraternal twins on a particular
test, say the Binet. We may generalize and say that a demonstration
that individual differences in performance on a given test are due
to heredity cannot be used as proof that a hypothetical trait or
ability imperfectly measured by that test is inherited. The argument
can best be presented by using an algebraic model.

We may write the factorial equation of the Binet test in the 
following form: 


$$
\sigma^2_{\mathrm{BINET}} = \sigma_g^2 + \sigma_v^2 + \sigma_n^2 + \sigma_{sp}^2 + \sigma_m^2 + \sigma_c^2 + \dots + \sigma_x^2 + \sigma_s^2 + \sigma_e^2
$$

where $\sigma^2_{\mathrm{BINET}}$ denotes the total variance of the Binet test, $\sigma_g^2$ the
contribution to that total variance made by ‘g’ or intelligence,
while $\sigma_v^2$, $\sigma_n^2$, $\sigma_{sp}^2$, $\sigma_m^2$, $\sigma_c^2$ . . . $\sigma_x^2$ denote contributions to the variance
by verbal, numerical, spatial, memory, comprehension, and
other group factors, and $\sigma_s^2$ and $\sigma_e^2$ stand for the contribution to
the variance of specific and error factors. Using estimates derived
from Burt & John’s (1942) analysis of the Binet, the total variance
contributed by ‘g’ or intelligence is only about 30 to 40 per cent,
or less than half the non-error variance. McNemar’s (1942) series
of analyses attributes on the average 40 per cent of the total variance
to ‘g’, the proportion ranging from 35 to 50 per cent. If we
neglect the error variance, which amounts only to about 5 per cent
and of course cannot be said to be caused either by heredity or
environment, but which is merely an error of measurement uncorrelated
with the abilities or traits the test is measuring, we can
conclude by and large that one-half of the variance at most is due
to ‘g’, while at least one-half is due to various other common
factors.
It will be clear now how unjustified is the jump from the statement
‘individual differences in Binet scores are accountable for in
terms of heredity to the extent of 80 per cent’, which, granted
certain assumptions, is a statement of fact, to the much more usual
statement that ‘individual differences in intelligence are accountable
for in terms of heredity to the extent of 80 per cent’, which is
a completely unwarranted generalization which could be made
only if all the non-error variance were attributable to ‘g’. It could
be that all the group and specific factors hypothesized were completely
determined by heredity, in which case heredity would play
only a very minor part in the determination of ‘g’; it could be that
the group and specific factors were largely caused by environmental
differences, in which case ‘g’ might be 100 per cent inherited.
Unless we can analyse the total variance of a test into its constituent parts,
and measure these parts separately, no scientifically tenable conclusion can be
drawn from the data. If this criticism be justified, it follows that the
whole literature on the inheritance of intelligence, perseveration,
social maturity, motor skill, conditionability, personality type, or
any other ability or trait, in so far as it is based on twin studies,
must be considered invalid. This conclusion may appear harsh in
the extreme, but it is difficult to see how it can be avoided on the
evidence to hand.
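
The indeterminacy argued for above can be put into figures with a short sketch. It rests on an assumption that is not stated in the text: that the hereditary share of the non-error test variance is a variance-weighted average of the hereditary share of ‘g’ and that of all the remaining factors, the factors being taken as uncorrelated. The function name and the weights passed to it are illustrative only.

```python
def g_heritability_bounds(test_heritability: float, g_share: float) -> tuple:
    """Range of hereditary shares of 'g' compatible with a test-level figure.

    Assumed model (not from the text):
        test_heritability = g_share * h_g + (1 - g_share) * h_other
    with h_g and h_other each lying between 0 and 1.
    """
    if not (0.0 < g_share <= 1.0 and 0.0 <= test_heritability <= 1.0):
        raise ValueError("shares must be proportions, g_share above zero")
    other_share = 1.0 - g_share
    # Lowest h_g: the other factors are taken as wholly hereditary.
    lowest = max(0.0, (test_heritability - other_share) / g_share)
    # Highest h_g: the other factors carry as little heredity as possible.
    highest = min(1.0, test_heritability / g_share)
    return lowest, highest


# Figures from the text: 80 per cent for the test as a whole, with 'g'
# contributing at most one-half of the non-error variance.
low, high = g_heritability_bounds(0.80, 0.50)
print(f"h_g may lie anywhere between {low:.2f} and {high:.2f}")
```

Under this additive reading, the single figure of 80 per cent for the test leaves the hereditary share of ‘g’ free to lie anywhere between about .60 and 1.00 when ‘g’ carries one-half of the variance; with a smaller share for ‘g’ the range widens further. Only when ‘g’ accounts for all the non-error variance do the two bounds coincide with the test-level figure.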

It follows from what has been said above that if we want to
measure the degree to which a particular trait or ability is inherited
in a given sample, then we must study, not the individual test
results, but rather the hypothetical underlying factors which
generate the test variance. In studying intelligence, that would
mean administering a battery of tests to the experimental population,
intercorrelating these tests, factor-analysing the resulting
correlation matrix, and obtaining factor scores for each experimental
subject on each factor isolated. We could then submit these factor
scores to the mathematical treatment appropriate to our problem,
and obtain data relevant not to one test only, but to intelligence,
or verbal ability, or memory, or whatever our factors might turn
out to be.
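
The sequence just described (battery of tests, intercorrelation, factor analysis, factor scores) can be sketched in a few lines. Everything in the sketch is an assumption made for illustration: the data are simulated, the battery has six hypothetical tests sharing one common factor, and the first principal component, extracted by power iteration, stands in for whatever method of factor extraction an investigator would actually use.

```python
import math
import random


def pearson(x, y):
    """Product-moment correlation between two equally long score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def first_factor(corr, iterations=500):
    """Dominant eigenvector and eigenvalue of a correlation matrix (power iteration)."""
    k = len(corr)
    vec = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        new = [sum(corr[i][j] * vec[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(v * v for v in new))
        vec = [v / norm for v in new]
    eigenvalue = sum(vec[i] * sum(corr[i][j] * vec[j] for j in range(k))
                     for i in range(k))
    return vec, eigenvalue


def standardize(column):
    """Convert raw scores to z-scores (population standard deviation)."""
    n = len(column)
    mean = sum(column) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in column) / n)
    return [(v - mean) / sd for v in column]


# Hypothetical battery: 200 subjects, six tests, one common factor plus noise.
random.seed(0)
true_loadings = [0.8, 0.7, 0.6, 0.6, 0.5, 0.4]
latent = [random.gauss(0, 1) for _ in range(200)]
tests = [[a * g + math.sqrt(1 - a * a) * random.gauss(0, 1) for g in latent]
         for a in true_loadings]

corr = [[pearson(x, y) for y in tests] for x in tests]   # intercorrelate the tests
weights, eigenvalue = first_factor(corr)                 # extract one factor
loadings = [w * math.sqrt(eigenvalue) for w in weights]  # test loadings on it
z = [standardize(col) for col in tests]
factor_scores = [                                        # one score per subject
    sum(w * z[t][s] for t, w in enumerate(weights)) / math.sqrt(eigenvalue)
    for s in range(len(latent))
]
```

It is the list `factor_scores`, not the score on any single test, that would then be compared between identical and fraternal pairs.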


In the field of cognition, the main factors underlying test performance
have been isolated by Spearman, Thurstone, Holzinger,
and other experimenters using the method of factorial analysis.
We know now what to measure, and we know how to measure it.
In the fields of conation and affection, however, the position is less
clear. There are numerous theories, but few facts; much is hypothesized,
but little known. The problem thus arises as to the correct
choice of the experimental variable.

Our choice has been determined largely by two considerations: 
(1) The social importance of the trait investigated, and (2) the 
existence of a sufficient body of knowledge regarding it. On both 
counts we had little difficulty in arriving at our decision to study 
the inheritance of the trait variously named neuroticism, emotional 
instability, or lack of integration, and discussed in detail in earlier 
chapters. We may state the hypothesis investigated in this paper in 
a formal manner: The trait of neuroticism, as operationally defined 
in terms of the pattern of intercorrelations between a specified set
of objective personality tests, is in large part determined by 
heredity, and such individual differences with respect to it as 
appear in the experiment cannot be accounted for in terms of 
environmental influences. 

The relevance of the experiment to the psychological and psychiatric
fields needs little stressing; it may be worth while, however,
to point out the importance attaching to the result from the
point of view of the logic of factorial analysis. It will have been
noted that our definition of neuroticism is essentially in terms of
factorial analysis, i.e. in terms of the condensation of a set of
observed correlations into a smaller number of hypothetical underlying
variables or factors. This method has frequently been criticized
on various grounds, and while some of these criticisms have
not always been based on thorough knowledge of precisely what is
implied in the method, it cannot be denied that some doubts could
not be allayed in terms of statistical arguments alone. In particular,
the fact that the resolution of a given matrix of intercorrelations
into factors can be carried out in an infinite number of ways has
perplexed many critics otherwise not hostile to this approach.
Thurstone’s method of overcoming this difficulty (a difficulty which
is not faced at all by some writers like Burt and Stephenson) is
well known; it consists essentially in overdetermining the solution.
This method, while of the utmost importance in work on abilities,
appears less well suited to the requirements of non-cognitive
experimentation, and the method of ‘criterion analysis’ has been suggested
as a plausible alternative (Eysenck, 1950).


However, proof has been lacking hitherto that factors thus
determined have any ‘real’ existence, and are something more
than mere ‘statistical artefacts’. In part the argument about real
existence is of course a philosophical and semantic one; in a
definite sense any scientific concept is an ‘artefact’ lacking real
existence. A concept, whether it be that of an electron, an instinct,
a quantum of energy, a complex, or a sound wave, is an abstraction,
and thus not ‘real’; scientific concepts are ‘artefacts’ almost
by definition. However, what critics usually mean by this objection
is something quite different. They denote a factor an ‘artefact’ if
it does no more than merely summarize existing knowledge, if it
does not go beyond the circle of its own derivation.

Here the present experiment should be of crucial importance. 
Having defined our factor of neuroticism in terms of the intercor- 
relations of a set of tests, we proceed to examine the biological unit 
of this factor by analysing the degree to which the factor as a whole 
is inherited. The crucial question, therefore, is this: Is the degree 
of hereditary determination of this factor greater than that of any 
single test contributing to the total factor variance, or is nothing 
gained by substituting the ‘factor score’ for the score on any of the 


tests which jointly define the factor? If the result shows that the
factor is inherited to a more . . . follows that we have suc . . .

In any study involving . . . twins, care must be taken to avoid errors
that might arise from unrepresentative samples. Many previous studies
have been open to criticism because of their methods of sampling. If the
selection of twins is carried out as in the study by Newman, Freeman and
Holzinger (1937), by inquiry into local schools, there exists the
possibility of overlooking those fraternal twins who are quite dissimilar,
thus yielding an underestimate of the average difference
between fraternal twins. This is because twins who are much alike
¹ The experiment was carried out by D. Prell and the writer in collaboration;
a detailed description has been published elsewhere (Eysenck & Prell,
1951). We wish to record our indebtedness to Dr. R. R. Race and staff of the
M.R.C. Blood Group Research Unit, the Lister Institute, for their kindness
in performing the laboratory work on the determination of blood groups for the
twins.
attract attention and are brought to the investigator’s notice, while
those differing considerably in appearance and behaviour may be 
overlooked. The questionnaire method of sampling is even more 
likely to yield a sample overloaded with fraternal twins who are 
very much alike. Probably the most adequate method of securing 
an unbiased sample is to use the birth record method (von 
Verschuer, 1939). This was the method used in the present study 
(Eysenck & Prell, 1951). 

The birth records for five boroughs in south London were 
searched for all twins of the same sex, born during the period 
1935-1937. The reason for selecting only like-sex twins was that 
identical twins of necessity belong to the same sex. If then, fraternal 
twins of the opposite sexes had been included, they would have 
introduced a possible complication due to sex differences. Con- 
cerning the age limits, the lower limit was set because children 
younger than 11 could not have taken all of the tests; the upper 
limit was set because a wide age-range necessitates statistical 
corrections for age which complicate the picture.

The survey of the birth records yielded the names of 130 pairs
of like-sex twins. From these it was possible to locate 68 pairs who
were living in the London area, close enough to be able to attend
the Psychological Laboratory of the Institute of Psychiatry. The
remainder were either living too far away, had died, or could not
be located. In no case was parental permission to test refused.

The twins were examined as they were located. After examination
they were classified as identical or fraternal according to a procedure
to be described presently.

Although there is no longer any doubt of the existence of two
types of twins, efficient criteria are needed in order to effect a valid
separation in all cases. Two methods of diagnosis have been used
to group identical and fraternal twins: the foetal-membrane method
and the similarity method. In view of the many criticisms made of
the former, the latter was employed.

The similarity method involves the comparison of the members
of a pair of twins in respect to numerous physical characteristics
which are determined by heredity. As the number of characteristics
is increased arithmetically, the chances of any two siblings not being
alike in all the characteristics is increased geometrically. Therefore,
if the chances of two children in the same family being alike
in one such characteristic is one in two, the chances of their being
alike in ten is one in one thousand.
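
The arithmetic behind the ‘one in one thousand’ can be checked directly. The sketch assumes, as the passage implicitly does, that the characteristics are independent of one another, so that the chances simply multiply; the function name is illustrative.

```python
from fractions import Fraction


def chance_alike_on_all(per_trait_chance: Fraction, n_traits: int) -> Fraction:
    """Chance that two siblings agree on every one of n independent traits."""
    return per_trait_chance ** n_traits


# One chance in two per characteristic, ten characteristics.
chance = chance_alike_on_all(Fraction(1, 2), 10)
print(chance)  # 1/1024, i.e. roughly one in one thousand
```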
In the present investigation the set of criteria upon which the
diagnosis of zygosity was made is as follows:


(1) Close resemblance of ears, teeth and facial features.

(2) Iris pigmentation.

(3) Standing height.

(4) Presence or absence of mid-digital hair.

(5) Ability to taste phenyl-thio-carbamide.

(6) Scapular shape.

(7) to (14) Blood groups A1A2BO, Rh, MNS, P, Lewis, Kell
and Lutheran.


(1) Close resemblance of ears, teeth, and facial features was
rated on a three-point scale: no resemblance, different (D); pronounced
resemblance, but slight differences (SD); and very pronounced
resemblance rendering it almost impossible to distinguish
the twins, same (S).

(2) The resemblance of the iris pigmentation was rated on a three-point
scale: no resemblance, different (D); pronounced resemblance,
but slight differences in one zone (SD); and very pronounced
resemblance, rendering it almost impossible to distinguish the
twins, same (S).

(3) The standing height of each twin was measured to the nearest
quarter of an inch.

(4) Presence or absence of hair on the dorsum of the mid-digital
region of the fingers was determined by placing the subject’s hand,
half clenched and bent slightly backwards, between the investigator
and a source of light. This was provided by a 100-watt
electric light placed seven feet behind the subject's hand. Presence
of any hair on one or more fingers was scored as: hair +.

(5) Ability to taste phenyl-thio-carbamide was ascertained by having
the subject drink one-quarter teaspoonful of a 1/20,000 solution
of PTC. The subject was asked what the substance tasted like.
Any answer other than water was scored as: taste +.

(6) The scapular shape was found by running the hand over the
inside edge of both scapulae. The twins were classified as either
concave (CC), straight (S), convex (CV), or mixed (M). The mixed
type included any combination of the other three: (CC-CV),
(CV-S), etc.

¹ In the majority of individuals, the iris is composed of two zones, an inner
and an outer of different pigmentation. In fifty pairs of siblings studied by Rife
(1943), no two pairs were found to have the same iris pigmentation in regard
to both zones, although in ten pairs the outside zones were the same, and in
four pairs the inside zones were the same.

Grouping the Twins into MZ and DZ. Using the data on the 14
criteria, the twins were first classified into one of three groups:
definite-MZ, definite-DZ, and doubtful. A pair of twins was considered
definite-MZ if they had (S)’s in criteria (1) and (2); differed less
than 1.5 inches in height on criterion (3); and if the children agreed
exactly on criteria (4) to (14) inclusive. Twins were considered as
definite-DZ if they were rated (D) on both criteria (1) and (2);
or if their height differed by more than 3.5 inches, criterion (3);
or if they differed on any one of criteria (4) to (14) inclusive. Twins
were considered doubtful if they had ratings of (SD) on both criteria
(1) and (2), or a rating of (SD) on either (1) or (2) and a rating of
(S) on the other; if they differed 3.5 inches or less on criterion (3);
and if they agreed exactly on criteria (4) to (14) inclusive.
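
The grouping rules above can be written out as a small decision function, which makes it easier to see which combinations of ratings they cover. The sketch is an illustration only: the function and argument names are hypothetical, criteria (4) to (14) are collapsed into a single flag, and the label ‘unclassified’ is an assumption introduced for combinations on which the stated rules are silent (for example, two (S) ratings with a height difference between 1.5 and 3.5 inches).

```python
def classify_pair(face: str, iris: str, height_diff_inches: float,
                  markers_agree: bool) -> str:
    """Apply the stated grouping rules to one pair of twins.

    face, iris         ratings on criteria (1) and (2): "S", "SD" or "D"
    height_diff_inches difference in standing height, criterion (3)
    markers_agree      True if the pair agree on all of criteria (4) to (14)
    """
    # definite-DZ: any one of the three stated conditions is sufficient.
    if (face == "D" and iris == "D") or height_diff_inches > 3.5 or not markers_agree:
        return "definite-DZ"
    # definite-MZ: all three stated conditions must hold together.
    if face == "S" and iris == "S" and height_diff_inches < 1.5:
        return "definite-MZ"
    # doubtful: (SD) on both, or (SD) on one and (S) on the other; the height
    # and marker conditions are already guaranteed by the first test.
    ratings = {face, iris}
    if ratings <= {"S", "SD"} and "SD" in ratings:
        return "doubtful"
    # Assumption: combinations the text does not mention are left unclassified.
    return "unclassified"


print(classify_pair("S", "S", 0.5, True))    # definite-MZ
print(classify_pair("SD", "S", 2.0, True))   # doubtful
print(classify_pair("S", "SD", 1.0, False))  # definite-DZ
```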

On the basis of the above procedure, 20 pairs of twins were
classified as definite-MZ, 24 pairs as definite-DZ, and 6 pairs as
doubtful.

When a pair of twins had been rated as doubtful, the blood
groups of both parents were ascertained in order to effect a final
classification.² The blood groups of the twins and their parents
were then compared to determine the chances of the blood groups
of two siblings from a known mating being alike on all of the blood
groups. In five of the doubtful pairs, the chances of the two children
being alike on all of the blood groups, taking into account the
blood groups of their parents, were: 1:256, 1:256, 1:1000,
1:1000, and 1:2000. Accordingly, these pairs of twins were classified
MZ. In the other doubtful pair the chances were 1:16; therefore
this set was classified DZ.
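
How such chances are built up may be illustrated by a minimal sketch. It is not the calculation used in the study, which is not given here; it treats the ABO system only, assumes that the parental genotypes are known (in practice only phenotypes may be), and assumes that different blood-group systems are inherited independently, so that their chances multiply. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def abo_phenotype(genotype):
    """Map a pair of ABO alleles, e.g. ('A', 'O'), to the blood group."""
    alleles = set(genotype)
    if alleles == {"A", "B"}:
        return "AB"
    if "A" in alleles:
        return "A"
    if "B" in alleles:
        return "B"
    return "O"


def child_phenotype_probs(mother, father):
    """Distribution of a child's blood group for a known mating.

    `mother` and `father` are two-letter genotypes such as "AO".
    Each parent passes on either allele with probability 1/2.
    """
    probs = {}
    for a, b in product(mother, father):
        group = abo_phenotype((a, b))
        probs[group] = probs.get(group, Fraction(0)) + Fraction(1, 4)
    return probs


def chance_siblings_alike(mother, father):
    """Chance that two ordinary siblings show the same blood group."""
    return sum(p * p for p in child_phenotype_probs(mother, father).values())


one_system = chance_siblings_alike("AO", "BO")
another_mating = chance_siblings_alike("AO", "AO")
print(one_system, another_mating)
```

The smaller the product of such chances over all the systems tested, the less plausible it is that a pair alike on every one of them are fraternal, which is the reasoning behind classifying the five pairs as MZ.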

The research design calls for a criterion group of neurotic children
against which the factor extracted from the normal twins
could be validated. Twenty-one children born between 1935 and
1937 were selected from out-patients at the Maudsley Child


1 Newman, Freeman, & Holzinger (1937) found that 94 per cent of their 50
sets of MZ twins had pair differences in standing height of less than 1.5 inches;
53.8 per cent of their 50 pairs of DZ twins had pair differences of less than
1.5 inches. None of their MZ twins had a pair difference of over 3.1 inches,
whereas 19.1 per cent of their DZ twins had a pair difference of over 3.1
inches.

2 Blood group tests were not made for the parents of all 50 pairs of twins as
it was feared that an attempt to persuade the parents to submit to a ‘blood-test’
might have resulted in a loss of the co-operation already secured.


182 HEREDITY AND ENVIRONMENT 


Guidance Clinic. Great care was taken to exclude children with 
organic complications, or with possible psychotic traits, or who 
were not definitely considered ‘unstable’ by the examining psy- 
chiatrist. The resulting sample of twenty-one children approaches 
as closely as is possible at the present stage of psychiatric knowledge 
a ‘pure’ neurotic group, with relatively little mixture of other 
mental or physical disorders. 

The following tests were used in this study: it should be remembered
that the choice was made before our knowledge of this factor
was as extensive as it is now, and that consequently a much better
choice could be made at the present time than was possible when
this experiment was being planned.


Tapping Speed (cf. photograph 22).
Track Tracer (cf. photograph 12).
Choice of ‘higher’ of two playing cards ([…] cards on the table; cf. photograph 21).
(8) Body-Sway Suggestibility. 
(9) Dynamometer Strength of Grip. 
(10) Word Dislikes. 
(11) Brown Personality Inventory (Adapted). 
(12) Lie Scale (adapted from M.M.P.I.). 
(13) Flicker Fusion. 
(14) Autokinetic Movement. 
(15) Autokinetic Suggestibility. 
(16) Speed of Writing S’s backwards. 
(17) Fluency. 


(A detailed description of the tests is given in Eysenck & Prell, 
1951; short descriptions of most of them will be found in other 
chapters of this book.)

The intercorrelations between the 17 tests are reported in
Table XXX for the 100 twins, as well as correlations with zygosity,
sex and age. Table XXXI records saturations of the tests for three
significant factors, which leave only insignificant residuals. Also
given in Table XXXI are the correlations of each test with the
criterion¹ (these are biserial correlations; all others are product-moment
correlations). Item 13 has been omitted from further
calculations as it was impossible to give this test to the neurotic
group. In order to rotate the factors so as to obtain maximum
correlation with the criterion, those tests which showed negative
correlations with the criterion were multiplied by -1 as indicated
in Table XXXI; this is merely a device which reverses the direction
of scoring of the tests affected, and leaves the data unchanged
in any material way. The last column of Table XXXI gives the
rotated factors. The new Factor I correlates .758 with the criterion
column, having been rotated into maximum agreement with it.
Factor II was then rotated in such a manner as to preserve
orthogonality with Factor I, and to take up all the remaining
variance on the intelligence test. Factor III is irrelevant to our
purpose, and no interpretation of it will be attempted. Interpretations
of Factors I and II are straightforward and dictated by the
results: Factor I is a factor of neuroticism, Factor II one of
intelligence.

1 The criterion, of course, is the difference for each test of the scores between
the 100 normal twins and the 21 neurotic children who form the control
group.

Table XXXII gives the means and variances for the neurotic
children, the identical twins, and the fraternal twins, on all 17 tests
and also on the factor score for the ‘neuroticism’ factor. The first
three columns give the means for the three groups; only two of the
differences between identical and fraternal twins are significant, at
the 5 and 1 per cent levels respectively, viz. the Neurotic Inventory
and the Lie Scale. It is difficult to interpret these results, particularly
in view of the fact that the inventory did not discriminate
at all between normal and neurotic children. Why identical twins
should be more given to lying than fraternal twins we cannot
explain! The next three columns give the variance for all the
children as individuals; the only difference, at the 2 per cent level
of significance, is on the Static Ataxia test, where identical twins
are more variable than fraternal twins. (Only differences between
the two types of twins are reported, as no particular interest
attaches to differences between normals and neurotics, apart from
the correlation of each test with this dichotomy, which is given in
Table XXXIII.)

The last two columns give variances for identical and fraternal
twins taken as pairs. In sixteen out of eighteen cases the identical
twin variance is larger; two of these differences are significant at
the 2 per cent level (Static Ataxia and Neuroticism Factor Score).
There seems to be little doubt that in our sample identical twins
and fraternal twins have almost identical means with respect to
neuroticism (23.20 and 22.96 respectively), but that the identical
group contains pairs of twins tending to be more extremely unstable,
and more extremely stable, than the fraternal twin group.

Table XXXIII shows the raw intraclass correlations between 
twins of both types, correlations with age, and the partial correla- 
tions resulting from eliminating age differences. It will be seen that 
age plays little part in determining scores, and that correction 
leaves the correlations very much as they were before. The last 
column of Table XXXIII gives Holzinger’s h² values, i.e. the
extent to which each test variance is determined by heredity. It
will be seen that the test of intelligence has an h² (.676) which
is almost identical with the h² values given by the best neuroticism
tests: static ataxia (.692), autokinetic movement (.648) and
suggestibility (.701).

As the last step in this procedure, factor scores on the neuroticism
factor were calculated for each twin and h² values calculated for
these ‘neuroticism scores’.

\[ h^2 = \frac{r_i - r_f}{1 - r_f} \]

where $r_i$ = intraclass correlation for identical twins and $r_f$ = intraclass
correlation for fraternal twins. It follows that

\[ 1 - r_i = \frac{\text{`within' } i \text{ variance}}{\text{`total' } i \text{ variance}} = \frac{13.680}{91.551} = .149, \]

from which $r_i = .851$, and

\[ 1 - r_f = \frac{\text{`within' } f \text{ variance}}{\text{`total' } f \text{ variance}} = \frac{32.480}{41.468} = .783, \]

from which $r_f = .217$. It follows that $h^2 = .810$.

This value is considerably higher than that given by any single
test, and indicates that the factor constitutes a biological unit
which is inherited as a whole.
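
The arithmetic leading to this value can be retraced in a few lines. The sketch below assumes, as the figures quoted above imply, that each intraclass correlation is obtained as one minus the ratio of ‘within’ to ‘total’ variance; the function names are ours.

```python
def intraclass_from_variances(within_variance, total_variance):
    """Intraclass correlation as 1 - within/total variance."""
    return 1.0 - within_variance / total_variance


def holzinger_h2(r_identical, r_fraternal):
    """Holzinger's h2 = (r_i - r_f) / (1 - r_f)."""
    return (r_identical - r_fraternal) / (1.0 - r_fraternal)


# Variances of the neuroticism factor scores quoted in the text.
r_i = intraclass_from_variances(13.680, 91.551)  # identical twins
r_f = intraclass_from_variances(32.480, 41.468)  # fraternal twins
h2 = holzinger_h2(r_i, r_f)

# The same calculation from the correlations rounded to three places.
h2_from_rounded = holzinger_h2(0.851, 0.217)

print(round(r_i, 3), round(r_f, 3))
print(h2, h2_from_rounded)
```

Carrying the unrounded correlations through gives a value marginally below that obtained from the rounded figures .851 and .217; the difference is confined to the third decimal place and does not affect the argument.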


* These mean scores of the normal children should be compared with the
mean score for neurotic children (13.34). There is relatively little overlap
between the two distributions of scores, and the difference between the normal
and neurotic group means is significant at a very high level.

The h² technique used in this paper gives the percentage of
twin difference variance attributable to nature providing that
certain assumptions are met. Two of these assumptions are that
nurture influences are the same for both types of twins, and that
differences due to nature are uncorrelated with differences due to
nurture. It is probable that these assumptions are not completely
met, and that consequently our estimate is too high. On the other
hand, another assumption, viz. that the variance due to errors of
measurement is negligible, is quite certainly not fulfilled, and in
view of the known unreliability of personality tests we must assume
that errors of measurement may play a considerable part. This
would lead one to believe that the found h² would be an underestimate
of the true value, and it seems not impossible that this
factor may cancel out the two previously mentioned. It is perhaps
permissible, therefore, to argue that the h² found is a rough and
ready estimate of the contribution which heredity makes to
individual differences in neuroticism.¹
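
The direction of the bias introduced by errors of measurement can be shown with invented figures. The sketch assumes the classical model in which errors are uncorrelated between the members of a pair, so that each observed intraclass correlation equals the true correlation multiplied by the reliability of the test; the ‘true’ correlations and the reliability below are hypothetical and are not estimates from our data.

```python
def holzinger_h2(r_identical, r_fraternal):
    """Holzinger's h2 = (r_i - r_f) / (1 - r_f)."""
    return (r_identical - r_fraternal) / (1.0 - r_fraternal)


def attenuate(true_r, reliability):
    """Observed intraclass correlation under uncorrelated error (assumed model)."""
    return true_r * reliability


# Hypothetical values chosen only to illustrate the argument.
true_r_i, true_r_f = 0.95, 0.25
reliability = 0.80

h2_true = holzinger_h2(true_r_i, true_r_f)
h2_observed = holzinger_h2(
    attenuate(true_r_i, reliability),
    attenuate(true_r_f, reliability),
)

print(h2_true, h2_observed)
```

Under this model the observed h² falls below the true value whenever the reliability is less than unity and identical twins correlate more highly than fraternal twins, which is the underestimation referred to above.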

Our conclusion regarding the size of this contribution is not in 
agreement with that arrived at by Newman, Freeman, & Holzinger 
(1937). Where they find that ‘the only group of traits in which
identical twins are not much more alike consists of those commonly
classed under the heading of personality . . .’, we have shown that
identical twins show a correlation on neuroticism of .851, while
fraternal twins show a correlation of only .217. From this it was
concluded that individual differences with respect to neuroticism,
stability, integration, or whatever we may wish to call this trait or
factor, are determined to a very marked extent by heredity, and
very much less markedly by environment. This conclusion, of
course, applies only in the general type of environment from which
all our twins came, and might not be applicable under conditions
of more extreme environmental variation such as may obtain in
other cultures.

1 We have used Holzinger's h² statistic in our estimates because no better
estimate of the contribution of heredity to the total variance is available.

Having demonstrated how strong an influence heredity exerts
in at least one field of personality, we cannot close this chapter
without drawing attention to certain important theoretical points
which would seem to be directly related to this demonstration.
These points arise largely out of what to us appears a somewhat
unhealthy tendency in current psychological literature to repeat
in the field of affection and conation the errors and misinterpretations
which vitiated the early work on intelligence. We are referring,
of course, to the unconquerable tendency among certain writers to
interpret their results in terms either of hereditary or environmental
influences, although the methodology of their research does
not permit any conclusions to be reached on this point. We have
learned through the lost labours of many years that an experimental
observation to the effect that dull parents have dull children
is irrelevant to the question: To what extent are intellectual
differences innate? We know now, as we should always have
realized, that both hereditary and environmental causes may
with equal facility be invoked to explain the observation. But we
do not appear to have learned to use ‘transfer of training’ and
apply this lesson to other fields than the intellectual. The literature
is replete with papers in which the fact that trait A in a group of
children is found to be correlated with trait X in their mothers is
used as an argument to the effect that the behaviour characterized
by trait X is directly causative of the behaviour characterized by
trait A. No appeal to the elementary text-book principle that
correlation does not and cannot prove a direct causal relation appears
sufficient to stem this flood of environmentalistic interpretation,
and even otherwise critical reviews often fail to point out the fact
that both trait A and trait X may be due to a third variable,
namely, an inherited tendency to behave in this manner, and may
be completely unaffected by environmental manifestations.
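
The point about a third variable may be made concrete by a small simulation. It is a sketch under invented assumptions, not a model of any actual study: an inherited disposition is passed in part from mother to child, trait X in the mother and trait A in the child each reflect the disposition of the person concerned plus independent noise, and no term whatever carries an influence from X to A.

```python
import math
import random


def pearson(xs, ys):
    """Product-moment correlation of two equally long lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate(n_families=20000, seed=0):
    """Correlation between mother's trait X and child's trait A.

    All coefficients are hypothetical. The child's disposition is half
    the mother's plus an independent part scaled to unit variance.
    """
    rng = random.Random(seed)
    mothers_x, children_a = [], []
    for _ in range(n_families):
        g_mother = rng.gauss(0, 1)
        g_child = 0.5 * g_mother + math.sqrt(0.75) * rng.gauss(0, 1)
        mothers_x.append(g_mother + rng.gauss(0, 1))  # trait X in the mother
        children_a.append(g_child + rng.gauss(0, 1))  # trait A; X plays no part
    return pearson(mothers_x, children_a)


r_xa = simulate()
print(r_xa)
```

With these coefficients the expected correlation between X and A is .25, although the mother's behaviour has, by construction, no effect on the child; an observed correlation of this order is therefore silent on the question of direct causation.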

Such neglect of caution in interpreting results could perhaps, 
according to current psychiatric thinking, be traced back to 
emotional causes and tie-ups which the present writer is not 
competent to disentangle. It is perhaps significant that both 
behaviourism and psychoanalysis, divergent as their paths other- 
wise appear to be, are at one in this respect; indeed, even the 
apparent differences between the democratic ideals of the U.S.A. 
and the U.K., and the dialectical authoritarianism which char- 
acterizes the ‘dictatorship of the proletariat’ in the U.S.S.R. melt 
away before this belief in the ‘equality of man’. Slater (1950) has
put this criticism into particularly strong, but entirely justifiable
terms. ‘There has . . . been an increasing tendency among clinicians
to minimize the effects attributable to genetical causes, and to
teach a psychiatry in which they receive little or no mention. This
tendency has been marked in Britain, but it has assumed formidable
strength in the U.S.A. Instead of a harmonious development,
in which the psychoses and neuroses, constitution and environment,
psychogenesis and physiogenesis receive their due share of
attention, interest among practical workers has been devoted more
and more exclusively towards psychotherapy, psychoanalysis, social
psychiatry, personnel selection, group therapy, and preoccupations
with anthropology, sociology, and political theory. In its
one-sidedness, this development is not healthy.

‘It is a sign of bad omen that it is possible for text-books of 
clinical psychiatry to appear, with claims for comprehensiveness, 
in which no mention is made of the established facts of genetics and 
of the hereditary element in mental disorder. Their authors appear 
to feel, though in fact this view depends on a misapprehension, that 
recognition of a hereditary factor implies a therapeutic nihilism; 
and that an energetic and optimistic attitude towards treatment
calls for a neglect of hereditary factors, just as a due appreciation
of the patient as an individual demands forgetfulness of nosological
entities. One suspects that the prime motivation is derived from
the philosophy of Dewey, which is no doubt over-simplified in the
notion that one should accept as true that which has convenient
practical applications.

‘It would not perhaps be putting it too high to say that we are
witnessing the manifestation of an anti-scientific tendency which
is winning an increasing number of supporters. The customary
canons of scientific reasoning are ignored by these schools. Uncomfortable
facts are left unconsidered. Hypotheses are multiplied
regardless of the principle of economy. Explanations which may
be valid for certain members of a class of phenomena are regarded
as true for the class as a whole. Interpretations which conform
with theory, and which might be true, are regarded as established.
Possible alternatives are not considered, and no attempt is made
to seek for evidence of critical value which shall decide between
them. Criticisms from outside are ignored, and only the initiate
may be heard. Utterance is dogmatic and arrogant, and lacks
scientific humility and caution. These are the mental mechanisms
which we associate with the growth of a religious orthodoxy, and
not with the progress of science. The movement is of significance
to genetics, because it is likely adversely to affect the personnel
and facilities for research, and to lead to a psychiatry without
biological foundation and divorced from contact with the other
natural sciences.’

We may perhaps with some advantage analyse in some detail
one particular study to see how this particular idol of the market-place
can find its way into the conclusion of what is otherwise an
admirable experiment.

The experiment chosen for this purpose may justifiably claim
to be the first attempt to apply factor analysis to psychoanalytic
hypotheses; indeed, apart from serving as our demonstration model,
the work to be discussed has many claims to be included in this
volume for the sake of the positive contribution it has to make. It
would have been easy to find a much more obvious and vulnerable
piece of work; we have chosen on purpose to deal rather with a
particularly subtle and well-controlled experiment.

The study under review deals with the influence of breast feeding
on character-formation (Goldman-Eisler, 1948, 1950, 1951).
The very first sentence sets the pattern: ‘The description of adult
character in terms of childhood experience is one of the basic
principles of psychoanalytic characterology, and indeed the genetic
approach (in the sense of ontogenetic, or referring to childhood
experience) to human personality is the essence of the theory and
method of psychoanalysis.’ The particular hypothesis chosen in
this experiment is concerned with oral character traits and their
origin, as outlined more particularly by Karl Abraham (1917,
1924, 1942) and E. Glover (1925). Oral character traits are believed
to originate from repressed or deflected oral impulses which are
dominant during the nursing period and which have undergone
transformation into certain permanent behaviour patterns by
processes of reaction-formation, displacement, or sublimation. Two
main syndromes are posited by these writers to emerge from
experiences of gratification or frustration attached to the oral stage
of libido development. One of these is the orally gratified type
(late weaning), who is supposed to be distinguished by an imperturbable
optimism, generosity, bright and sociable conduct,
accessibility to new ideas, and ambition accompanied by sanguine
expectation. The other, the orally ungratified type (early weaning),
is characterized by a profoundly pessimistic outlook on life, sometimes
accompanied by moods of depression and attitudes of withdrawal,
a passive-receptive attitude, a feeling of insecurity, a need
for assurance, a grudging feeling of injustice.

We thus have two hypotheses: (1) Certain traits tend to correlate
together in a well-defined way so as to give rise to a factor of
orality, and (2) the position of a person on the continuum defined
by this factor is determined by his experiences of early or late
weaning. The first of these hypotheses is clearly testable by means
of a factorial analysis; the second is subject to a more […] test
using analysis of variance. One hundred and fifteen middle-class
adults constituted the experimental population; for all of these
statements from their mothers regarding process and time of weaning
were available. Nineteen scales, mainly of a questionnaire
type, were constructed or modified from existing scales, to measure
traits which according to the hypothesis under investigation were
related to ‘orality’. Table XXXIV gives the titles of these scales,
together with a split-half reliability coefficient for each scale. (The
average number of items in each scale was 8; average reliability
was .76.) The full scales are published elsewhere (Goldman-Eisler,
1951).

[Photograph 20. Test in action.]

TABLE XXXIV

(Reliability coefficients are presented in parentheses following the description
of each trait.)

Optimism: Optimism and pessimism are antithetical expressions in character-formation
of an omnipotent and magical relation to reality. Optimism
deriving from this unconscious source would be upheld by the individual
against reasonable expectation (r = .90).

Pessimism: Inability to accept frustrating experiences as part of reality.
Defence against disappointments through summary and advance resignation,
anticipation of disappointment (r = .74).

Passivity: Passive-receptive attitude, hedonism, indolence, self-indulgence
(r = .81).

Desire for the Unattainable: Intense desire to climb combined with a feeling
of unattainability, of difficulty in achievement, of the insuperable, grudging
incapacity to get on (r = .78).

Displaced Oral Aggression (Verbal): Aggressive use of speech, ‘omnipotent
valuation of speech’, ‘incisive speech’.

Aggression: The desire to injure or inflict pain, to overcome opposition
forcefully, to fight, to revenge an injury, to attack, to oppose forcefully
(r = […]).

Aloofness: Negative tropism for people, attitude of rejection (r = .73).

Ambition: Tendency to overcome obstacles, to exercise power, to strive to do
something difficult as well and as quickly as possible, to attain a high standard,
to excel one’s self (r = .68).

Autonomy: Those who wish neither to lead nor to be led, those who want
to go their own way, uninfluenced and uncoerced. Independence of attitude.

Dependence: Tendency to cry, plead or ask for nourishment, love, protection
or aid. Helplessness, insecurity (r = .56).

Guilt: ‘Conscience’, inhibiting and punishing images, self-torture, self-abasement
(r = .90).

Change: A tendency to move and wander, to have no fixed habitation, to
seek new friends, to adopt new fashions, to change one's interests and vocation.
Inconsistency and instability (r = .71).

Conservatism: Adherence to certain places, people, and modes of conduct.
Fixation and limitation. Enduring sentiments and loyalties, persistence of purpose.
Consistency of conduct; rigidity of habits (r = .56).

Impulsion: The tendency to act quickly without reflection. Short reaction time,
intuitive or emotional decisions. The inability to inhibit an impulse (r = .82).

Deliberation: Inhibition, hesitation and reflection before action. Slow
reaction time, compulsive thinking (r = .87).

Exocathexis: The positive cathexis in practical action and co-operative
undertakings. Occupation with outer events: economic, political, and social
occurrences. A strong inclination to participate in the contemporary world of
affairs (r = .72).

Endocathexis: The cathexis of thought or emotion for its own sake. A preoccupation
with inner activities, feelings, fantasies, generalizations, theoretical
reflections, artistic conceptions, religious ideas; withdrawal from practical life.

Nurturance: Tendency to nourish, aid, and protect a helpless object. To
express sympathy, to mother a child, to assist in danger (r = .79).

Sociability: Tendency to form friendships and associations; to join, and live
with, others. To co-operate and converse socially with others. To join groups.
These 19 scales were intercorrelated, and the resulting table is
given below (Table XXXV). A factor analysis was carried out,
and the saturations for two factors are reported in Table XXXVI.
It will be seen that the first factor does indeed resemble the hypothetical
‘oral’ type: aloofness, pessimism, endocathexis, conservatism,
deliberation, guilt, and aggression forming the ‘ungratified’
type, and exocathexis, optimism, nurturance, sociability, change,
and impulsion forming the ‘gratified’ type. So far, then, the
hypothesis appears to be confirmed. (We shall not here discuss
the interpretation of the second factor, as this would take us too
far afield. The interested reader is referred to the original paper
(Goldman-Eisler, 1951).)
We must now turn to the second hypothesis, linking this type
with early or late weaning. An analysis of variance was carried out
on ‘early weaners’ and ‘late weaners’, defining these in terms of
weaning at not later than four months of age, and not earlier than
five months of age respectively. This reduced the number of cases
to 89. A second analysis of variance compared ‘early weaners’, as
defined above, and ‘very late weaners’, defined as having been
weaned at not earlier than nine months of age. Scores for this
analysis were estimates of factor scores. The analyses are given
below in Tables XXXVII and XXXVIII. It will be seen that in
both cases the results are in the expected direction, and that in
both cases they are significant at the 1 per cent level. The correlation
between ‘early weaning’ and ‘gratified orality’ is .271 according
to the first analysis, and .305 according to the second analysis.

TABLE XXXVI

                      Factor Saturations
Trait                     I         II
Optimism               -.680       ...
Pessimism               .542      -.041
Passivity               .376       .214
Unattainable            .050       .209
Oral Aggression        -.156       .303
Aggression              .209       .479
Aloofness               .750       .294
Ambition               -.192       .153
Autonomy                .156       .560
Dependence              .245      -.296
Guilt                    ...       ...
Change                   ...       ...
Conservatism             ...       ...
Impulsion                ...       ...
Deliberation             ...       ...
Exocathexis              ...       ...
Endocathexis             ...       ...
Nurturance               ...       ...
Sociability              ...       ...

TABLE XXXVII

Source                                   Degrees of   Sum of    Mean
                                         Freedom      Squares   Square
Within weaning group                         88       22.267     .253
Between early and late weaning group          1        2.033    2.033
Total                                        89

F = 8.04     Epsilon = .271

TABLE XXXVIII

Source                                   Degrees of   Sum of    Mean
                                         Freedom      Squares   Square
Within weaning group                         64       16.271     .254
Between early and very late weaning
group                                         1        1.904    1.904
Total                                        65       18.175     .280

F = 7.50     Epsilon = .305

An alternative method for assessing the correlation between
type and weaning is available. We can include ‘early weaning’ in
our factor analysis, and establish its saturation with the first factor.
This turns out to be .337, thus confirming the previous result which
shows a correlation of approximately .3 between oral type and
weaning. We might thus conclude that both hypotheses had been
verified; we have found the hypothetical type to exist roughly as
posited, and we have found this type to be related to weaning in
the predicted direction at a high level of confidence. It is at this
point that we must bring our criticism to bear.
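
The F ratio and the epsilon of Table XXXVII can be recovered from the sums of squares alone. The sketch below assumes that epsilon is Kelley's unbiased correlation ratio, i.e. the square root of the between-groups sum of squares less its chance expectation, divided by the total sum of squares; the formula is not stated in the text and is our assumption, as are the function names.

```python
import math


def f_ratio(ss_between, df_between, ss_within, df_within):
    """Between-groups mean square divided by within-groups mean square."""
    return (ss_between / df_between) / (ss_within / df_within)


def epsilon(ss_between, df_between, ss_within, df_within):
    """Unbiased correlation ratio (assumed to be the 'Epsilon' of the tables)."""
    ms_within = ss_within / df_within
    ss_total = ss_between + ss_within
    eps_squared = (ss_between - df_between * ms_within) / ss_total
    return math.sqrt(max(eps_squared, 0.0))


# Sums of squares and degrees of freedom from Table XXXVII.
f_value = f_ratio(2.033, 1, 22.267, 88)
eps_value = epsilon(2.033, 1, 22.267, 88)

print(f_value, eps_value)
```

On this assumption the figures of Table XXXVII are reproduced to within rounding of the mean squares; the epsilon so obtained is the correlation of about .3 between type and weaning on which the subsequent discussion turns.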
Let us first of all look closely at the traits characterizing the
oral type. Comparison with results of factorial analyses published
elsewhere in this volume will show that the structure of traits found
here is very similar to that which has given rise to the hypothesis
of an introvert-extravert dichotomy; the results of the factorial
analysis agree with this interpretation at least as well as they do
with that made by Goldman-Eisler. This, of course, is not entirely
unexpected; when experienced psychologists and psychoanalysts
set down syndromes of trait resemblances which they have observed
in their experience it is not unlikely that they should agree in their
observations, even though they may give different names to the
syndromes thus isolated. It is important, though, to remember that
for Jung and his followers extraversion or introversion are largely
determined by constitution, while for Freud and his followers the
‘oral’ type is determined entirely in terms of early childhood
experience. Thus the results of the factorial study do not confirm
Freud any more than they confirm Jung; as far as the crucial
difference between them is concerned the results are neutral.
What can we say of the correlation of this typology with weaning?
Surely the existence of a correlation as small as .3, even
though it be fully significant, cannot under any circumstances be
given a causal interpretation. It is equally plausible to suggest that
introverted mothers tend to have introverted children by the action
of genetic factors; that introverted mothers tend to wean their
children earlier because of their lack of ‘exocathexis’; and that
consequently there will arise a correlation between early weaning
and child’s introversion which can all too easily mislead the
investigator to believe in some principle of direct causation.

There is an obvious objection to this line of argument. It may
be maintained that in criticizing the experiment we have left out a
crucial part. Did not the investigator set out to test the predicted
consequences of a clearly formulated theory? What more can a
single experiment do than to verify such a deduction? Is not too
high a standard being applied to experiments of this kind, a
standard which contradicts the logic of scientific methodology?

The answer to this objection is very simple. The deduction 
from a theory the testing of which may serve as a confirmation of 
the theory must refer to phenomena which themselves are as yet 
unknown. This essential requirement is not fulfilled in the present 
case. Abraham and Glover based their hypothesis on the observation
that among their patients there appeared to be a close concomitance
between personality and certain weaning practices in
their childhood. From this observation they erected the hypothesis
of the oral type. But clearly this hypothesis cannot be tested simply
by confirming the original observation of a correlation between
oral type and weaning! To test it would require a deduction
different from the original observation. Such a deduction should
preferably be made in such a way as to provide a crucial experiment
between the two rival hypotheses outlined above. In the
absence of such an experiment, all that the Goldman-Eisler study 
has done is to confirm in a very impressive manner Abraham’s and 
Glover’s original observation of a correlation. The interpretation 
of that correlation is not affected in any way by her experiment, 
and it would be indicative of ‘environmentalist’ prejudice to inter- 
pret it as favouring the Freudian view, just as it would be evidence 
of a ‘hereditarian’ prejudice to interpret it as favouring the Jungian 
view. 

It is because the former prejudice is so widespread and so
insidious that we have included this extended discussion here.
Implicit assumptions which govern interpretation of data in the
absence of factual evidence may be far more harmful to a science
than explicit errors which can easily be put right. The environmentalist
assumption underlies the greater part of contemporary
work in the field of personality, and still determines interpretation
of strictly neutral material in the complete absence of confirmatory
evidence. This does not mean that the environmentalist hypothesis
is necessarily false; we have no knowledge on which to base an
opinion. The need, as always, is for well designed experiments
which will enable us to cease arguing, and substitute fact for
interpretation.


Chapter Six 


THE PSYCHOTIC DIMENSION 


In Dimensions of Personality the writer has discussed briefly the
possible relationship between the factor of extraversion-
introversion isolated there, and the schizothymia-cyclothymia
typology advocated by Kretschmer. It was felt that the facile
equation of the introvert with the schizothyme, which many writers
in the field of personality have assumed as a fact, could not be
maintained in view of the explicit statements by Jung (1923) and
Kretschmer (1948) regarding the position of the hysteric in their
respective typologies. For Jung, the hysteric is a prototype of the
extrovert; for Kretschmer, the hysteric has close affinities to the
schizophrenic and belongs, therefore, quite definitely to the
schizothyme group. As it is clearly impossible for the hysteric to
be both extroverted and schizothymic, if we equate introversion
with schizothymia, it is impossible to regard these two typologies
as identical. In view of the absence of independent evidence on
this point, no positive suggestions were made regarding the
position of Kretschmerian types in our own dimensional system of
personality. However, clearly, this was an unsatisfactory position
and an experimental integration of the two systems became a
priority in our experimental programme.

Another problem arose in this connection which also demanded
an empirical answer. As will be shown, Kretschmer’s hypothesis of
a cyclothymia-schizothymia dimension of personality is dependent
on the assumption of an essential continuity between normal and
psychotic mental states; in other words, he postulates a normal-
psychotic continuum, or a factor of ‘psychoticism’. In the same
way, Jung implies a factor of ‘neuroticism’ in making neurotic
syndromes prototypes of his normal typology. The question arises
whether these two dimensions of neuroticism and psychoticism are
identical (assuming they both exist), or whether they must be
regarded as clearly separate and unrelated. Existing theories are
not at all helpful on this point. Freud, who posits a general factor
of regression, would identify the neuroticism and psychoticism
factors, describing the psychotic as one who has regressed more
severely, and the neurotic as one who has regressed less severely,
the normal person, presumably, showing signs of regression. More
orthodox psychiatrists seem to assume that neuroticism and
psychoticism are situated along two unrelated dimensions; yet
their actual practice in diagnosing patients does not seem to follow
their theoretical point of view. A patient is diagnosed, for instance,
either as a schizophrenic or as an hysteric. If these two disorders
are considered to lie on orthogonal axes, there is no reason why the
patient should not be diagnosed as suffering from both these
disorders, just as a patient might be an hysteric and intelligent, or
schizophrenic and tall. This current procedure could only be
defended if we assumed that psychotic and neurotic states are
qualitatively different from the normal, a belief held more
frequently with respect to psychotics than with respect to
neurotics. It will be clear that the field is in a somewhat chaotic
state and that not only do different writers flatly contradict each
other, but that the same author at different times will follow
different and often contradictory hypotheses. Again an empirical
clarification is essential.


This chapter, then, is devoted to a description of the
Kretschmerian system and to a report on our experimental results,
which were sufficiently clear-cut to permit at least a preliminary
and provisional interpretation. The account of Kretschmer’s system
will be somewhat more detailed than may appear necessary at
first glance; the reasons for this detailed description are threefold.
In the first place, Kretschmer’s system is known to the Anglo-
Saxon world almost entirely through the translation of a very early
edition of his book; the important revisions and the large amount
of experimental evidence accumulated since by Kretschmer and
his students are hardly ever mentioned or taken into account. This
means, as the writer has shown elsewhere (1950c, 1951), that critics
[...] and to appreciate the important advances which he has made
since. An up-to-date exposition of Kretschmer’s system, therefore,
seems an absolute necessity for an adequate understanding of the
problem involved.

In the second place, Kretschmer has been dealing for a long
time with methodological problems in the field of typology which
are of great importance in connection with the general aim of this
book and which have again hardly received the attention which
they deserve, although the writer does not believe that the methods
which Kretschmer suggested and used in the 1920’s are as powerful
and convincing as the factorial methods introduced since. They
are of great interest and importance, nevertheless, and deserve
more critical appreciation and discussion than they have received
hitherto.

In the third place, Kretschmer and his students have originated
subsidiary hypotheses and have constructed tests to investigate
these hypotheses, which appear to the writer far in advance of
most of the work being carried out in the field of personality
research in other countries, using much more sophisticated
statistical methods. Indeed, to anyone surveying personality
research dispassionately, it will be only too obvious that there
exists a rather high negative correlation between, on the one hand,
psychological insight, astute framing of hypotheses, and general
understanding of personality structure, and, on the other hand,
statistical sophistication, ability in the construction of
experimental tests for the testing of hypotheses, and expert
knowledge of methods of ascertaining validity and reliability.
Kretschmer, although his approach is deficient in the elaborate
statistics which have recently been the vogue, nevertheless
combines these two indispensable sides of personality research to
an unusual degree, and his work deserves to be better known than
it is. It is not good for science that a stereotyped view of a man’s
contribution should obscure aspects of that contribution which do
not fit in with the stereotype. It is in the hope of destroying certain
erroneous notions and aiding a better appreciation of Kretschmer’s
real contribution that the first part of this chapter has been
written.

Kretschmer’s system is a typology, but it would be very wrong
to imagine that his conception of ‘type’ is similar to that so
frequently criticized in elementary text-books. In this simplified
kind of presentation, the concept of ‘type’ is often contrasted with
the concept of ‘trait’, and it is suggested that the ‘type’ approach
implies a bimodal type of distribution, whereas the ‘trait’ approach
implies a unimodal type of distribution. The argument is often
advanced that as most human characteristics can be shown to vary
unimodally, the ‘type’ approach must be erroneous. This argument
is, of course, fallacious, as the observed distribution of scores
on a test which has no rational metric underlying it has no
necessary connection with the true distribution of scores. All we
have available with psychological tests are the observed
distributions, of course, and to argue from those to the ‘true’
distributions is not admissible.
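
The point about the metric can be made concrete with a small numerical sketch. It is only an illustration under stated assumptions: the 'latent' scores are simulated normal deviates, and the exponential rescaling stands in for an arbitrary test metric; neither comes from the studies discussed here. The same persons, in the same rank order, yield a symmetrical distribution on one scale and a markedly skewed one on the other.

```python
import math
import random
import statistics


def skewness(xs):
    """Population skewness (third standardized moment) of a list of scores."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)


def ranks(xs):
    """Rank of each score (0 = lowest)."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    result = [0] * len(xs)
    for rank, i in enumerate(order):
        result[i] = rank
    return result


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical 'true' scores: symmetrical and unimodal by construction.
latent = [rng.gauss(0.0, 1.0) for _ in range(5000)]

# Hypothetical test metric: a strictly increasing rescaling of the same scores.
observed = [math.exp(x) for x in latent]

same_order = ranks(latent) == ranks(observed)  # persons keep their rank order
skew_latent = skewness(latent)                 # close to zero
skew_observed = skewness(observed)             # strongly positive
```

Since any strictly increasing rescaling is as defensible as any other when no rational metric exists, the shape of the observed distribution cannot by itself decide between 'type' and 'trait' conceptions.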
In any case, regardless of the validity of the argument from
distribution, the concept of ‘type’ as being essentially bimodal is
not in accord with Kretschmer’s own definition (or, indeed, with
that of Jung or most other Continental typologists). Kretschmer
makes his meaning quite clear. ‘The concept of type is the most
important fundamental concept of all biology. Nature . . . does not
work with sharp contrasts and precise definitions, which derive
from our own thought and our own need for comprehension. In
nature, fluid transitions are the rule, but it would not be true to
say that, in this infinite sea of fluid empirical forms, nothing clear
and objective could be seen; quite on the contrary. In certain
fields, groupings arise which we encounter again and again; when
we study them objectively, we realize that we are dealing here with
focal points of frequently occurring groups of characteristics,
concentrations of correlated traits. . . . What is essential in biology,
as in clinical medicine, is not a single correlation, but groups of
correlations; only those lead to the innermost connections. It is a
daily experience in the field of typology, which can be deduced
quite easily from the general theory, that in dealing with groups of
characteristics one obtains higher correlations than with single
characteristics. . . . What we call, mathematically, focal points of
statistical correlations, we call, in more descriptive prose,
constitutional types. The two are identical; it is only the point of
view which differs. . . . A true type can be recognized by the fact
that it leads to ever more connections of biological importance.
Where there are many and ever new correlations with fundamental
biological factors, for instance, in the constitutional types dealt
with here, we are dealing with focal points of the greatest
importance.’
It will be seen that the conception of type which Kretschmer has
elaborated here is very similar to the one given by the writer, who
regards types as ‘observed constellations or syndromes of traits’
and traits as ‘observed constellations of individual action
tendencies’ (1947). This use of the term type, which is not far
removed from the conception discussed by Murphy and Jensen
(1932), appears to be free from the defects which the concept of
type is often alleged to possess.
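
Kretschmer's remark that groups of characteristics yield higher correlations than single characteristics can be illustrated by the standard formula for the correlation of a unit-weighted composite with a criterion. The sketch below assumes, purely for simplicity, that all k characteristics are standardized, correlate equally with the criterion, and correlate equally with one another; the function name and the numerical values are hypothetical and are not taken from Kretschmer's data.

```python
import math


def composite_validity(k, r_item_criterion, r_item_item):
    """Correlation of the sum of k standardized items with a criterion.

    Assumes every item correlates r_item_criterion with the criterion and
    r_item_item with every other item (an idealized, hypothetical case).
    Not every combination of values is jointly possible for real data.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    variance_of_sum = k + k * (k - 1) * r_item_item
    return k * r_item_criterion / math.sqrt(variance_of_sum)


single = composite_validity(1, 0.3, 0.2)   # one characteristic alone
grouped = composite_validity(4, 0.3, 0.2)  # four such characteristics combined
```

With these illustrative values a single characteristic correlates 0.30 with the criterion, whereas the group of four correlates about 0.47, which is the effect described in the quotation.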
The particular system of correlations which Kretschmer chooses 
as a starting point lies in the constitutional field, i.e. in the field of 
body types, where he contrasts, as is well known, the pyknic and
the leptosomatic types, with a third one, the athletic, considered
sometimes intermediate and sometimes definitely divergent from
the other two. In addition, he has dysplastic types and a certain
proportion of unassignable doubtfuls. While Kretschmer’s approach
is not exactly along statistical lines, a number of factorial studies
of intercorrelations between bodily dimensions, reviewed by the
writer (Eysenck, 1947), have shown that essentially there is a main
dichotomy corresponding closely to his pyknic-leptosomatic type,
and there can, therefore, be little doubt that this first step in his
typology is eminently sound, although it may be suggested that
greater statistical sophistication might have enabled him to purify
his concepts rather more, and also to arrive at better indices of
body build than those elaborated in his book.
The position of the athletic type raises important points here,
a consideration of which cannot be deferred. Kretschmer and
Enke (1936) consider the athletic type as being essentially different
from both the others rather than as being intermediate between
them. This position appears unacceptable to us. In the first place,
there is no independent statistical proof, such as only factor analysis
could supply: in our own work we found no evidence of a factor of
this kind (Eysenck, 1947). In the second place, in the large body
of experimental work in which Kretschmer and his followers have
tried to differentiate the three body types with respect to
psychological functions, it will be found in almost every case that
the athletics are intermediate between pyknics and the
leptosomatics, though somewhat closer to the latter. This suggests
very strongly that they are not in a separate group, but are, in
truth, intermediate. The attempt of Kretschmer and Enke to create
a third type from the athletics often leads to rather disingenuous
arguments. Finding usually that leptosomatics and athletics differ
very little in their test results and are opposed to the pyknics, they
have to argue that the causes which lead to the test results of
leptosomatics are different from those which lead to the test results
of the athletics, although no evidence of any kind is given to
support these ad hoc arguments. The reader will be able to judge
this point from the experiments quoted elsewhere (Eysenck,
1951), and detailed data are given there for all three groups
separately.
If the pyknic-leptosomatic typology is not only justified, but
also fruitful, we should expect to find a large number of
correlations of psychological importance. However, it may be noted
that there are also a large number of physiological correlations,
particularly in the field of autonomic functioning and of
endocrinology, which deserve some mention; these have been
summarized briefly by the writer elsewhere (Eysenck, 1950c, 1951).
Having laid his foundation in terms of body build, it is well
known that Kretschmer proceeds to point out that there is close
affinity between the manic-depressive type of insanity and pyknic
body build on the one hand, and between schizophrenic disorders
of all kinds and the leptosomatic (and to a smaller extent the
athletic) type on the other. He also points out the particular
affinity obtaining between athletic body build and epilepsy, basing
himself, in part, on Dubitscher’s work with the Rorschach Test
(1932), who found among athletics reactions very similar to those
found by Rorschach among epileptics. Westphal (1931) gives a
table embodying over 8000 cases showing these relations fairly
clearly (Table XXXIX).


TABLE XXXIX

                Schizophrenics:   Manic Depressives:   Epileptics:
Body Build        5233 cases         1361 cases         1505 cases
                      %                  %                  %
Pyknic              13.7               64.6                5.5
Athletic            16.9                6.7               28.9
Leptosomatic        50.3               19.2               25.1
Dysplastic          10.5                1.1               29.5
Doubtful             8.6                8.4               11.0
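
As a check on the reading of Table XXXIX, the percentages can be tabulated and summed. The sketch below is a minimal one: the figures are transcribed from the table with the separators read as decimal points, and the variable names are our own. It confirms that each diagnostic column sums to 100 per cent, and it finds, for each body build, the diagnostic group in which that build is most frequent.

```python
# Percentages transcribed from Table XXXIX (Westphal, 1931); names are illustrative.
TABLE_XXXIX = {
    "Schizophrenics": {
        "Pyknic": 13.7, "Athletic": 16.9, "Leptosomatic": 50.3,
        "Dysplastic": 10.5, "Doubtful": 8.6,
    },
    "Manic-depressives": {
        "Pyknic": 64.6, "Athletic": 6.7, "Leptosomatic": 19.2,
        "Dysplastic": 1.1, "Doubtful": 8.4,
    },
    "Epileptics": {
        "Pyknic": 5.5, "Athletic": 28.9, "Leptosomatic": 25.1,
        "Dysplastic": 29.5, "Doubtful": 11.0,
    },
}

# Each column should account for the whole of its diagnostic group.
column_totals = {
    group: round(sum(shares.values()), 1)
    for group, shares in TABLE_XXXIX.items()
}

# For each body build, the diagnostic group showing the highest percentage.
builds = list(TABLE_XXXIX["Schizophrenics"])
highest_group = {
    build: max(TABLE_XXXIX, key=lambda group: TABLE_XXXIX[group][build])
    for build in builds
}
```

On this reading the pyknic build is most frequent among manic-depressives, the leptosomatic among schizophrenics, and the athletic among epileptics, in agreement with the affinities stated in the text.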


The writer has reviewed the literature with respect to these
somatopsychic relations elsewhere (Eysenck, 1947) and will not 
repeat his conclusions; by and large, we may accept the main 
points made by Kretschmer—i.e. the prevalence of leptosomatic 
body build among schizophrenics and of pyknic body build among 
manic-depressives. It is known, however, that Kretschmer goes 
beyond the correlation of psychotic disorders and body types; he 
believes that schizophrenia and manic depressive insanity are 
merely extremes of contrasted psychological trait syndromes, which 
he calls the cyclothyme and the schizothyme, respectively. He 
holds that what is true of the extremes is also true, if to a lesser
extent, of the less exaggerated, more normal members of each
type. As he says: ‘Only when this viewpoint is pursued into the
field of normal psychology will we be able to appreciate the
problem of constitution in its full importance. There is no jump
in thus going over into normal psychology, but as we follow the
threads between body build and psychological peculiarity from
the psychotic, step by step, through all types of psychopathic
personality and get further and further away from those great
mental disturbances which form the beginning of our investiga- 
tion—lo and behold— suddenly, we find ourselves among healthy 
people, among well-known faces. Here we recognize as familiar,
normal features those traits which previously we had seen in cari- 
cature. We find the same types of facial structure, the same stigmata 
of bodily constitution, and we find that behind the same exterior 
dwell the same psychological forces.’ 

This general theory may, perhaps, be introduced by reference
to Figure 26. This figure shows the distribution of the whole
population in terms of a normal curve of distribution, ranging from
one extreme (schizophrenia) to the other (manic-depressive
insanity). All persons left of the mean would be schizothymic,
meaning by that merely that their personality make-up has in
common certain elements which are grotesquely exaggerated in
those psychotic patients whom we label schizophrenics, whereas all
those to the right of the mean would be cyclothymics, meaning by
that that their personality make-up has in common certain
elements which are grotesquely exaggerated in manic-depressive
patients. Persons who are definitely abnormal but not yet psychotic
Kretschmer calls schizoid or cycloid respectively, whereas the large
number of persons in the centre of the distribution he calls
syntonic, if they are on the cyclothymic side, and dystonic if they
are on the schizothymic side. It is possible that Kretschmer would
object to the use of a normal curve to depict the relation between
schizothymes and cyclothymes, but little importance can be
attributed in any case to the form of distribution when the
underlying metric is unknown; just as in the case of the
distribution of intelligence, the normal curve must be regarded
merely as a convenient device rather than an accurate
representation of actuality.

Figure 26
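
The terminology just described can be summarized in a short sketch. It is an illustration only: Kretschmer gives no numerical boundaries, so the two cut-off values (in standard deviation units from the mean) and the function name are assumptions of ours, chosen merely to show how the six labels are arranged along the continuum.

```python
def kretschmer_label(z, abnormal_cutoff=1.5, psychotic_cutoff=2.5):
    """Return (side, label) for a standard score z on the hypothetical continuum.

    Negative z lies on the schizothymic side, zero or positive z on the
    cyclothymic side. Both cut-offs are arbitrary illustrative values.
    """
    schizothymic = z < 0
    side = "schizothymic" if schizothymic else "cyclothymic"
    distance = abs(z)
    if distance >= psychotic_cutoff:
        label = "schizophrenic" if schizothymic else "manic-depressive"
    elif distance >= abnormal_cutoff:
        label = "schizoid" if schizothymic else "cycloid"
    else:
        label = "dystonic" if schizothymic else "syntonic"
    return side, label


examples = [kretschmer_label(z) for z in (-3.0, -2.0, -0.4, 0.4, 2.0, 3.0)]
```

Moving the cut-offs changes only the proportions of persons given each label, not the order of the labels along the dimension, which is all that the figure is meant to convey.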

We believe that in all essentials, Figure 26 brings out accurately
the main points of Kretschmer’s views. We see, therefore, that
Kretschmer is suggesting a definite dimension of personality, which
we may call cyclothymia-schizothymia, but it would seem to follow
from his writings that another dimension is also implied, ranging
from normality to psychotic disorder and orthogonal to the first,
so that his theory can best be represented in terms of two
orthogonal axes, one measuring schizothymia-cyclothymia, the other
normality-psychotic abnormality or ‘psychoticism’. Indeed, if we
were to follow him faithfully, we would have to add two further
dimensions, namely, the diathetic and the psychasthetic scales.
In his view, cyclothymes vary among themselves on a scale ranging
from humorous, vivacious, quick-witted, to the quiet, calm, [...]
(the so-called diathetic scale); whereas schizothymes vary from
shy, nervous, sensitive, to dull, stupid, torpid (the so-called
psychasthetic scale). As, however, there is no experimental
evidence in Kretschmer’s work regarding these scales, and as he
makes little use of them and does not define their relation to each
other in any way, we have thought it better to simplify the problem
by concentrating on his major hypothesis rather than on these
subsidiary ones. The student of this problem, however, should keep
them in mind.

Attempts have been made by Kretschmer (1948), Van der
Horst (1916, 1924), Kibler (1925), and Zerbe (1929) to make direct
tests of the association between mental type and physical type,
assessing the former by means of interviews or questionnaires, and
statistical tests applied to these figures by the writer have shown
that the results are in accordance with the hypothesis at a high
level of confidence (Eysenck, 1950). However, Kretschmer, quite
rightly, lays most stress on a somewhat indirect procedure which
is derived from Van der Horst. ‘There always appeared a marked
correlation between the normal leptosomatic subject and the
schizophrenic patient on the one hand, and the normal pyknic
subject and the manic-depressive subject on the other. This
suggests a close relation between the psychological make-up of the
leptosomatic and the schizophrene and a firm concordance between
the psychological structure of the pyknic and the manic-
depressive.’ Put in other words, Van der Horst and Kretschmer try
to show the general validity of the concept of schizothymia and
cyclothymia beyond the psychotic realm by using body build as a
tertium quid; their method of proof is to show that normal people of
leptosomatic or pyknic body build react to certain psychological
experiences in a manner similar to that of schizophrenes and manic-
depressives, who are known to be also leptosomatic or pyknic on
the average. This is an ingenious method which deserves careful
consideration. Some of the results from it, particularly those of
Van der Horst himself (1916, 1924) and of Kibler (1925) are very
impressive indeed. Fundamentally, however, it has certain
weaknesses which make it doubtful whether any definite conclusions
can be derived in this way. As a methodology it follows only partly
the rules of the hypothetico-deductive method, as the results
deduced from the hypothesis are not stated in a rigorous enough
fashion to make proof or disproof possible. This is particularly so
in view of the fact that the authors quoted do not state their results
in terms which are amenable to proper statistical treatment.
However, in spite of these criticisms, it should be realized that
here we have a method which could be made into an extremely
powerful tool by slight changes in methodology and procedure, and
when it is realized that the work to be described was carried out at
a time when the research genius of the rest of the world was still
gazing upon the Bernreuter Inventory as a non plus ultra of
personality tests, it will be clear that here we are dealing with a
serious effort to come to grips with a problem of fundamental
importance at a time when its very existence was realized by very
few other workers.
We shall return to these methodological considerations later on
and attempt to suggest an improved method for solving the very
difficult problem which Kretschmer set himself, by means of a
modification of the factorial approach. Before doing that, however,
we shall briefly summarize various researches falling under this
general heading. As has been mentioned before, Kretschmer’s
conception of type implies the discovery of certain correlations
between traits. Thus, the schizothyme in his view would be found
to be dissociative, form-reactive, high in personal tempo, better in
motor co-ordination, more perseverative, and with greater
autonomic reactivity. It is the intercorrelation of these (as well as
many other) traits which defines the schizothyme pole of the type
continuum as opposed to the cyclothyme pole, which would be
characterized by traits showing him to be integrative, colour
reactive, low in personal tempo, poorer in motor co-ordination,
less perseverative, and with less autonomic reactivity. The
experimental evidence for these various assertions has been
discussed at some length elsewhere (Eysenck, 1950c, 1951); here we
may take one of these traits as an example of the treatment which
Kretschmer gives to these concepts. For this purpose, the trait
which he calls ‘Spaltungsfaehigkeit’ has been chosen, i.e. a trait
characterized at the one pole by an ability to dissociate, at the
other pole by a tendency to integrate diverse mental content. As is
the case with most of Kretschmer’s work, the hypothesis is derived
originally from a close clinical study of large numbers of subjects.
It is then given an operational definition by the elaboration of a
number of objective tests which are conceived to measure the
particular trait hypothesized. These measures are then applied to
groups of schizophrenes and manic-depressives and normal
leptosomatics and pyknics, and the hypothesis is considered to be
verified if observed differences lie in the expected direction, at a
reasonable level of statistical confidence.

Kretschmer holds that the concept of dissociation (Spaltung) is
of fundamental importance in understanding the mentality of the
schizothyme, just as its opposite, integration, is important for the
understanding of the cyclothyme mentality. His concept of
dissociation goes further than that of Warren (1934): ‘The breaking
up of a combination of any sort into its constituents.’ He means by
it, ‘The ability to form separate and partial groupings within a
single act of consciousness: from this results the ability to dissect
complex material into its constituent parts.’ This tendency towards
dissociation characterizes the schizothyme and when exaggerated
puts the schiz into schizophrenia! The absence of this ability to
dissociate leads to a concrete, synthetic way of looking at the
mental content which characterizes the cyclothyme and, in
exaggeration, the manic-depressive.

A brief description of some of the experiments performed by 
Kretschmer and his students to quantify this concept will make 
the discussion more realistic. 


1 Kretschmer does not himself give calculations to assess the significance of
his findings but these calculations have been made in connection with the
experiments discussed and are given in detail in the paper already referred to
(Eysenck, 1950c, 1951); here, we need only note that for each of the experiments
to be described now statistical significance was adequate at the .01 level.

21. Speed of decision test.

22. Dotting test.

(1) In a complex reaction-time experiment, it is possible to
measure the disturbing effect of various agents such as noise, a
flashing light, etc. These distracting stimuli, according to the
hypothesis, should lengthen the reaction-time of manic-depressives
and of normal pyknics much more than that of schizophrenics or of
normal leptosomatics, as the schizothyme group should be able to
keep the two mental contents (experimental stimulus and
disturbance) separate to a much greater extent than the
cyclothymes, for whom one should interfere much more with the
other.

(2) In another experiment, the subject has to remember the
numbers of differently coloured squares on a card which is being
presented to him, one row at a time, the theory underlying the
experiment being that the schizothyme, with his dissociative ability,
would easily be able to carry in his mind the number of different
categories into which to classify these various coloured squares,
and that he would be quicker and more accurate in the total task.

(3) In this experiment, coloured groups of nonsense syllables
are shown to the subject in a tachistoscope with instructions to
observe either the colours or the letters. He is later questioned about
both letters and colours. The hypothesis underlying the experiment
would require the schizothyme, with his higher abstractive ability,
to be able to observe what is required and to pay no attention
to other features of the stimulus, whereas the cyclothyme would
remember more of what he was not asked to observe and less of
what he was asked to observe.

(4) Long, unfamiliar words are presented tachistoscopically,
being shown 10 times in succession so as to facilitate reading of the
word, which could not be completed at one exposure. There are
two ways of getting at the meaning of the word: (i) the abstractive,
analytic, dissociative method, in which the total word is built up
by reading successive letters and syllables and constructed as a
whole from these parts, and (ii) the global, synthetic, integrative
method in which a single impression is obtained and then
elaborated in successive exposures. The first of these methods
would be expected to characterize the schizothyme, the second the
cyclothyme.

(5) In this experiment, an attempt is made to study the effect
of mental addition of twenty numbers on the regular rhythm of
ergograph work. It is hypothesized that schizothymes would be
more easily capable than cyclothymes of keeping these two
different tasks separate, so that they would give more correct
answers in the one and have less disruption of the regular rhythm
in the other task.

(6) On the Rorschach test we would expect schizothymes and
cyclothymes to show marked differences when a comparison is
made between the number of whole (W) versus the number of
detailed (D, d) answers, an expectation borne out in actual fact.
These are only six of a much larger number of experiments
carried out by Kretschmer, but they do illustrate the
hypothetical trait of dissociative ability, and as each of these
experiments has been shown to differentiate significantly in at
least one case between extreme groups on the schizothymia-
cyclothymia continuum, it must be admitted that there is, at least,
evidence here which favours the Kretschmerian hypothesis. In an
unpublished factorial analysis involving the tests, as well as an
intelligence test and a measure of body type (the Rees-Eysenck
index of body-build described in Dimensions of Personality),
the writer was able to show that a general factor was present in
these tests and in the measure of body type, even when intelligence
had been partialled out. This experiment was only done on a small
number (N = 44) of unselected university students at the
University of Pennsylvania and until results from a replication, at
present in progress with much larger numbers, are available, it
cannot be considered anything but suggestive. However, it can be
seen that as far as experimental results go, they do not contradict
but tend to support the Kretschmerian hypothesis.
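
What is meant by 'partialling out' intelligence may be indicated by the ordinary first-order partial correlation, which gives the correlation between two measures x and y with a third measure z held constant. The sketch below states that general formula only; it is not a reconstruction of the unpublished analysis, whose exact procedure is not described here, and the numerical values are hypothetical.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    denominator = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    if denominator == 0.0:
        raise ValueError("x or y is perfectly correlated with z")
    return (r_xy - r_xz * r_yz) / denominator


# Hypothetical values: two tests correlating 0.50 with each other and
# 0.30 and 0.40 respectively with an intelligence test.
r_tests_given_intelligence = partial_correlation(0.50, 0.30, 0.40)

# A correlation wholly due to the third variable vanishes when it is removed.
r_wholly_due_to_z = partial_correlation(0.25, 0.50, 0.50)
```

In the first illustrative case the correlation falls only from 0.50 to about 0.43, so that most of the association would remain after intelligence had been allowed for.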

What has been said of this particular trait appears to be true,
mutatis mutandis, of the other traits described by Kretschmer.
Available evidence has been discussed in detail elsewhere (Eysenck,
1950c, 1951); in the absence of factorial studies dealing with these
traits, we cannot admit that they have been definitely shown to
exist and be measurable by means of the tests developed by
Kretschmer. Nevertheless, the consistent results obtained by him
and other workers suggest that at least we have here an available
source of hypotheses, theories, and well formulated research
projects.


Defining, as does the present writer (1947), a type as ‘an
observed constellation or syndrome of traits’, Kretschmer does not
draw the obvious conclusion that we must first identify our traits
and then isolate the type by means of the observed intercorrelation
of traits; instead, he tries to prove the existence of the traits by
arguing from the existence of a type. In other words, he shows that
the one test of perseveration discriminates between cyclothymes
and schizothymes and that another test of perseveration also
discriminates in the same way; this he seems to regard as sufficient
proof that both tests measure perseveration, which is, of course, a
logical fallacy. The writer has shown that a test of suggestibility
discriminates between neurotics and normals and that a test of
persistence also discriminates in this fashion, and that accordingly
the two tests correlate (Eysenck, 1947). It does not follow from this
that persistence is the same thing as suggestibility, and, indeed, the
proof that the tests in question are tests of persistence or
suggestibility has to be given in terms of separate factorial studies
embodying different types of suggestibility and persistence tests.

While Kretschmer places much importance on these traits, as
we may, perhaps, call them in conformity with modern
psychological usage, the mainstay of his whole system, of course, is
the cyclothymic-schizothymic dichotomy. We must criticize his
method of proving that this dichotomy, whose existence in the
psychotic field few would deny, can and should be accepted in the
normal field. A more direct method than the one used by him,
which relies on the possibly irrelevant tertium quid (the pyknic-
leptosomatic dichotomy), is required for this purpose, and is to be
found in the method of criterion analysis. Although Kretschmer
shows that there is a correlation between body-build and the two
main types of functional psychotic disorder, that correlation is not
very high, and the writer has shown elsewhere (Eysenck, 1947)
that body-build is correlated with other variables (neuroticism,
extraversion-introversion) which are unrelated to the
Kretschmerian concepts. If that be so, then clearly many of
Kretschmer’s findings, which appear superficially to support his
views, may in reality have quite a different explanation. To give
just one example, Kretschmer has shown that leptosomatics are
slow and accurate while pyknics are quick and inaccurate. This may
be interpreted in terms of his system; however, an alternative
explanation is also possible as we have shown that hysterics tend
to be more of a eurymorphic (pyknic) body-build than are
dysthymics, who tend to be leptomorphic in body-build. It has also
been shown that hysterics tend to be quick and inaccurate,
whereas dysthymics tend to be slow and accurate. It follows from
this that there should be a correlation between body-build and a
speed-accuracy test, but the interpretation of the experiment may
be in terms quite different from those advanced by Kretschmer,
using concepts essentially alien to his system. It is because of the
danger of indirect proof exemplified in this case that a more
direct method of examining Kretschmer’s hypothesis becomes
necessary. Let us turn then to a more direct test of Kretschmer’s
hypothesis, using the method of criterion analysis.
In order to use the hypothetico-deductive method in this field,
it is particularly important to state the hypotheses to be tested
quite clearly, and to make deductions from them which can be
tested empirically. As explained before, there are two main
hypotheses involved in Kretschmer’s system: (1) The functional
psychoses (schizophrenia and manic-depressive insanity) are not
qualitatively different from normal mental states, but form one
extreme of a continuum which goes all the way from the perfectly
normal, rational to the completely insane, psychotic individual.
All possible intermediate stages are represented on this
continuum. (2) The two main functional psychoses (schizophrenia
and manic-depressive insanity) show patterns of traits which are
observable in non-psychotic persons also, although in a less
extreme degree, and which give rise to a continuum running from
the extreme schizothyme to the extreme cyclothyme, again with
all intermediate steps being represented on this continuum.
These continua are presumed to be orthogonal to each other.

A third hypothesis is frequently identified with Kretschmer’s
system, but does not appear essential to it, although Kretschmer
himself has made considerable use of it in his own attempt to
supply proof of his general system. This is the hypothesis that the
schizothymic-cyclothymic continuum is correlated with body-build,
schizothymes being leptosomatic, cyclothymes being pyknic,
with respect to their bodily habitus. The third hypothesis, being
logically independent of the other two, will not be tested in the
present experiment.

We have attempted to state Kretschmer’s hypotheses in such
a way that a statistical and experimental test of them becomes
possible; while we believe that in stating them in this fashion we
have not misrepresented him in any way, and while we believe
that these views are held by him, explicitly or implicitly, it should
be borne in mind that in thus reducing a complex and difficult
system to two brief fundamentals we may have done violence to
this system. Whether this be so or not, the reader must decide for
himself on the basis of Kretschmer’s own writings.


We may outline briefly how criterion analysis will be used in
conjunction with the problem posed by Kretschmer’s hypothesis.
Let the line AB in Figure 27 represent the normal-psychotic
continuum, and let the line LU cut off at point X that part of the
continuum containing mental states conventionally diagnosed
‘psychotic’ by psychiatrists. (The distribution of the total population
has been tentatively included in the figure in the form of a
normal curve of distribution. As the actual form of the distribution
is irrelevant to the argument, which merely hypothesizes a
continuous distribution, any other form of rectilinear or curvilinear
distribution might be substituted for the normal form without
affecting the argument.) Let n objective psychological tests (a, b,
... n) be given to the two populations separated by the line
LU, i.e. to a normal group and to a psychotic group. (The term
‘normal’ here means nothing but ‘not under psychiatric care for
mental disorders’; it does not imply anything more positive than
this absence of demonstrable and demonstrated mental disorder.)
Let us assume that each of these n tests distinguishes significantly
between the normal and the psychotic groups. (In actual practice,
n + x tests would have to be given in all, so that tests not
distinguishing at the chosen level of significance could be rejected.)

[Figure 27: the continuum AB, divided at point X into a NORMAL and a PSYCHOTIC segment]

Now let us divide the normal group into two parts, by making
a cut at point L on the line AX; similarly, let us make a cut at
point M on the line XB, thus subdividing the psychotic group also.
(Points L and M may be anywhere between A and X, and X and
B respectively; there is no implication that they should divide
the respective populations into equal halves.) On the hypothesis
that AB represents a true continuum, we have now divided both
the normal and the psychotic groups into two parts, one of them
more normal, the other more psychotic. Group AL is more normal
than LX; group XM is more normal than group MB. If AB is a
true continuum (the hypothesis to be tested), and if each of our
n tests is related to that continuum in a linear fashion
(supplementary hypothesis), then it would follow that on these n tests
group AL would differ from group LX in the same way that
group AX differed originally from group XB. Similarly, group
XM would differ from group MB in the same way that AL differed
from LX.

We may put these arguments in a form which permits of their
being tested. If tests a and b differentiate significantly between
groups AX and XB, then it would follow according to our hypothesis
that they should differentiate also between AL and LX.
Similarly, they should differentiate between XM and MB. This
deduction, unfortunately, does not permit of any direct test, as
there is no known method of determining points L and M, and
therefore of differentiating the groups under discussion. However,
we can transform the concept of ‘differentiate’ into the concept of
‘correlate’, and test our hypothesis in this fashion. If on tests a and
b the normal group does better than the psychotic group, it would
follow, as explained above, that group AL should do better on
both tests than group LX. Similarly, group XM should do better
on both tests than group MB. But these statements are synonymous
with saying that for both the normal and the psychotic groups separately,
there should be a positive correlation between tests a and b.
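
The step from ‘differentiate’ to ‘correlate’ can be illustrated with a small simulation. The sketch below is not taken from the experiment: it assumes a normally distributed latent continuum, an arbitrary cut-off X, and two hypothetical tests a and b that are linear functions of the continuum plus independent error. Every name and numerical setting in it is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical position of each person on the continuum AB.
n_people = 20_000
position = rng.normal(size=n_people)

# Assumed cut-off X: persons beyond it are labelled 'psychotic'.
X = 1.0
psychotic = position > X

# Two hypothetical tests, each linearly related to the continuum
# (the supplementary hypothesis), with independent error. Higher
# scores mean better performance, so scores fall as position rises.
test_a = -1.0 * position + 0.5 * rng.normal(size=n_people)
test_b = -0.8 * position + 0.5 * rng.normal(size=n_people)


def corr(x, y):
    """Product-moment correlation between two score vectors."""
    return float(np.corrcoef(x, y)[0, 1])


# Both tests differentiate: normals score higher on average.
diff_a = test_a[~psychotic].mean() - test_a[psychotic].mean()
diff_b = test_b[~psychotic].mean() - test_b[psychotic].mean()

# Within each group taken separately, the tests correlate positively.
r_normal = corr(test_a[~psychotic], test_b[~psychotic])
r_psychotic = corr(test_a[psychotic], test_b[psychotic])

print(f"mean differences: a={diff_a:.2f}, b={diff_b:.2f}")
print(f"within-group r: normal={r_normal:.2f}, psychotic={r_psychotic:.2f}")
```

Under these assumptions the within-group correlations are positive although points L and M were never located, which is the property the argument relies on.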

This argument can be extended to n tests, and implies as a
consequence that tests which differentiate significantly between normals
and psychotics should give positive intercorrelations when these correlations
are run for the normal or the psychotic groups separately. Here then we
would have a possible test of our hypothesis. But we can refine this
test a little by pointing out that not only should these correlations
be positive, but also that they should be proportional to the power
of each test to differentiate between the normal and psychotic
groups originally. We may express this differentiating power in
terms of a biserial correlation for each test with the
normal-psychotic dichotomy, denoting the column of n correlations
between tests and criterion the Criterion Column (C_np). We may
then say that the average of the intercorrelations of test a with all
the other tests in the battery, for either the normal or the psychotic
group, should be proportional to the correlation of test a with
the criterion, i.e. the normal-psychotic dichotomy.

[...] correlations between our n tests. Let us assume that we
extract two factors from each of our two matrices, the sets of
intercorrelations for the normal and the psychotic groups
respectively, which we may call F_p and F_p' for the psychotic and
F_n and F_n' for the normal group. It would follow from our
hypothesis and our selection of tests that F_n and F_p should be
proportional to each other, and to our Criterion Column, C_np.
Here, then, we have reached the final and crucial test. We
have stated a hypothesis as to the existence of a general factor
underlying the pattern of variances and covariances of test
performances of normal and psychotic groups, and deduced certain
consequences which would follow if our hypothesis were true, and
which would not follow on any tenable counter-hypothesis. In so
far as these deductions are verified, we may consider our
hypothesis as supported; in so far as these deductions are not
verified, we may consider our hypothesis as disproven.

It will be clear that a similar procedure could be used with
regard to our second hypothesis. What would be required there
would be a set of n tests which discriminate between schizophrenic
and manic-depressive patients; the Criterion Column in this case
would consist of the biserial correlations of each test with this
new criterion. Apart from these changes, the procedure as outlined
above is applicable to just the same extent as it is in respect
to our first hypothesis. The continuum in question is shown in
diagrammatic form in Figure 26; on the basis of Kretschmer’s
hypothesis we would expect factors F_n' and F_p' to be proportional
to each other, and also to the new Criterion Column, C_sd.

Both hypotheses can be tested at the same time, using n tests
which differentiate significantly (using Fisher’s F test) between
the three groups involved, the normal, schizophrenic, and
manic-depressive. The two criterion columns, C_np and C_sd, could then
be calculated quite easily by running biserial correlations between
each of the n tests and the normal versus psychotic dichotomy for
C_np (using the combined schizophrenic and manic-depressive
groups to form the psychotic group), and by running biserial
correlations between each of the n tests and the
schizophrenic-manic-depressive dichotomy for C_sd. Product-moment
correlations would then be run between the n tests for the normal
and the psychotic groups separately, thus giving us two separate
matrices. These, when factor analysed according to either the
centroid or the summation method, or indeed any of the current
methods, should result in two factors each, F and F', which would
be proportional to each other, and to the respective criterion
columns.
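
The procedure just outlined can be sketched in a few lines for the first hypothesis. This is a minimal illustration on simulated data, not the computation reported here. It assumes eight hypothetical tests related linearly to one latent continuum; it uses the ordinary product-moment correlation with a 0/1 dichotomy as a stand-in for the biserial coefficient; and it extracts only the first centroid factor, without sign reflection or a second factor. All names and settings are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: one latent continuum and eight tests related to it
# linearly with different strengths (all values are assumptions).
n_people, cut = 20_000, 0.5
strength = np.linspace(0.3, 1.0, 8)
z = rng.normal(size=n_people)
noise = rng.normal(size=(n_people, strength.size))
scores = z[:, None] * strength + 0.7 * noise
psychotic = z > cut  # persons beyond the assumed cut-off X


def criterion_column(scores, group):
    """Correlation of each test with the 0/1 group dichotomy."""
    g = group.astype(float)
    return np.array([np.corrcoef(col, g)[0, 1] for col in scores.T])


def first_centroid_factor(r):
    """First centroid factor of a correlation matrix.

    The diagonal is replaced by the largest off-diagonal value in each
    column as a communality estimate. No sign reflection is done, so
    this sketch assumes that all column sums are positive.
    """
    r = r.copy()
    off_diagonal = r - np.diag(np.diag(r))
    np.fill_diagonal(r, np.abs(off_diagonal).max(axis=0))
    return r.sum(axis=0) / np.sqrt(r.sum())


def corr(x, y):
    """Product-moment correlation between two columns of coefficients."""
    return float(np.corrcoef(x, y)[0, 1])


c_np = criterion_column(scores, psychotic)
f_n = first_centroid_factor(np.corrcoef(scores[~psychotic], rowvar=False))
f_p = first_centroid_factor(np.corrcoef(scores[psychotic], rowvar=False))

r_fn_fp = corr(f_n, f_p)
r_fn_c = corr(f_n, c_np)
r_fp_c = corr(f_p, c_np)
print("F_n with F_p :", round(r_fn_fp, 3))
print("F_n with C_np:", round(r_fn_c, 3))
print("F_p with C_np:", round(r_fp_c, 3))
```

Because the simulated data are built on a single continuum, the three correlations come out high; data in which the groups differed qualitatively would not be expected to show this proportionality.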

This, in brief, is the method which has been used in the present
experiment. The experimental population consisted of [...] normal
subjects and 100 psychotic subjects divided into two equal groups,
consisting of 50 manic-depressives and 50 schizophrenics.
Altogether, 84 sets of scores were derived from over [...] independent
objective tests given to these three groups, of which the great
majority were significant when the F test was applied to the scores
of these three groups. The tests used included four tests of fluency,
the Crown Word Connection List, a work-curve test, a colour-form
test, a mirror-drawing test, a social-attitude test, [...] level
of aspiration tests, a concentration test, 11 tests of expressive
movements, two tests of perseveration, one test of dissociation
(Spaltungsfaehigkeit), one test of tapping, four tests of [...],
one speed-accuracy test, one suggestibility test, one [...] test,
and one persistence test. Results of these various tests are given in
detail elsewhere (Eysenck, 1951). The results may be summarized
verbally, drawing attention to the following points. In the first
place, the large number of significant differences found shows that
the three groups tested are very unlikely to have come from a
single universe; there is good reason to believe that the selection
procedures have been successful in giving us subjects showing
much greater heterogeneity than would have been expected by
chance.

In the second place, it appears that on the majority of tests, the
schizophrenic group scores somewhere between the normal and the
manic-depressive group. Out of 38 test results significant at the
.1 per cent level, only six show exceptions to this general tendency.
In other words, it would appear as if we were dealing with one
continuum, ranging from normal through schizophrenic to
manic-depressive. This point will be taken up again later, when further
data are being adduced to throw some light on it.

In the third place, it is clear even from a casual inspection of
the data that the variances in the psychotic groups are considerably
larger than those in the normal groups; there is also a
tendency, although less strongly marked, for variances in the
manic-depressive group to be larger than in the schizophrenic
group.

1 Further details regarding population and other features of the
experimental group will be found elsewhere (Eysenck, 1951).

In the fourth place, it is interesting to note that those tests
which had in past work been shown to have high correlations with
neuroticism, or to discriminate well between normals and
neurotics, do not show even a tendency to discriminate between
normals and psychotics. We might mention here such tests as
persistence, body sway (with and without suggestibility),
perseveration, work curve inversions, and the word connection list.

In the fifth place, it is clear that neither the colour-form test
nor the dissociation test gives results which support Kretschmer’s
hypothesis. There are no significant differences with respect to
these tests, and the actual figures are not even suggestive of any
such differences as he predicts. It is possible that the tests used
were unsuitable for a relatively high-grade group of subjects,
and that with tests better suited to the level more significant
results might have been obtained. While this argument cannot be
refuted, there is no evidence to favour such speculation in our
results.

In the sixth place, it is apparent that there are a number of
personality traits which distinguish the normal from the psychotic
group at or below the 1 per cent level. We may perhaps briefly
summarize these, emphasizing of course that the terms used here
should be understood to bear the operational connotation imposed
upon them by the actual tests used. We find then that psychotics
are less fluent, perform poorly in continuous addition, perform
poorly in mirror drawing, show slower oscillation on the reversal
of perspective test, are slower in tracing with a stylus, are more
[...] with respect to social attitudes, show poorer concentration,
have a poorer memory, tend to make larger movements
and to overestimate distances and scores, tend to read more
slowly, to tap more slowly and to show levels of aspiration much
less reality adapted (Eysenck, 1950, 1951). Those who would
argue that of course psychotics are poorer in all tasks than are
normals, and that consequently these results are hardly surprising,
will have to explain why no such differences were observed with
respect to the tests enumerated under point four above.

In the seventh place, no differences were observed with respect
to a number of tests other than those mentioned under number
four. These are: social attitudes, such as radicalism, time of
writing, number of lines drawn. Occasionally, tests under the same
general classification gave conflicting results; thus, under the
heading of ‘Fluency’, tests ‘Flowers’ and ‘Words’ showed no significant
differences, while tests ‘Birds’ and ‘Animals’ did. Similarly, when
both the best and the worst of a series of scores on a test were
scored separately, different results were sometimes obtained. As
an example, we may select test 7, ‘Concentration’, where two
sub-tests are used, ‘Letters’ and ‘Numbers’, each consisting of eight
separate scores which are added to give total scores. For both
‘Letters’ and ‘Numbers’, the lowest scores do not give significant
differences, while the highest scores give significant differences.
This phenomenon could with advantage be studied separately; its
implications are, presumably, that while normals often do as
poorly as psychotics on a single trial, psychotics hardly ever do
as well as normals, even on a single trial.

This completes our brief summary of the isolated results of the 
experiment, and we must turn now to the study of the patterns of 
intercorrelations by means of criterion analysis.

In view of the fact that not all the tests used gave highly
significant results, and as many of the scores used were not
experimentally independent of each other, twenty tests in all were
selected for the factorial study. These tests are given in Table XL.
Also given in this table are the biserial correlations of each test
with the following dichotomies: (1) controls versus depressives;
(2) controls versus schizophrenics; (3) schizophrenics versus
depressives (C_sd); (4) controls versus psychotics (C_np). The
product-moment intercorrelations for these twenty tests were
calculated for the normal (control) group, and separately for the
psychotic group. The tables are given in Eysenck (1951).

TABLE XL

                                                 I       II      III     IV
                                                r_cd    r_cs    r_sd    r_cp

 1. Overestimation in distance judgment        -.160   -.325    .150   -.233
 2. Fluency: animals                            .323    .203    .189    .293
 3. Social attitudes: zero responses           -.189   -.255    .070   -.22?
 4. Reading prose: time in secs.                .364    .332    .031    .335
 5. Three circles: time in secs.                .322    .254    .140    .268
 6. Concentration: numbers                      .274    .279   -.037    .278
 7. Tapping: 15 secs.                           .197    .280   -.081    .244
 8. Mirror drawing: J scores, average          -.?64   -.237   -.228   -.301
 9. Perspective reversal: slow                  .290    .089    .208   -.199
10. Perspective reversal: normal                .193    .224   -.016    .229
11. Abstractions: letters remembered            .379    .195    .207    .?01
12. Three squares: diameter                    -.?97   -.271   -.041   -.322
13. Tracing test: J scores, average            -.274   -.265    .016   -.257
14. Work curve: lowest score                    .376    .262    .129    .309
15. Size estimation: half-crown                -.253   -.332    .106   -.294
16. Numbers 1-20: length of writing             .250    .063    .216    .162
17. Expressive movements: length of waves      -.459   -.213   -.279   -.341
18. Expressive movements: amplitude            -.556   -.242   -.339   -.402
19. YEAR: length of writing                    -.353   -.228   -.097   -.275
20. Suggestibility                             -.005   -.042    .039   -.025

These two tables were factor analysed by means of Thurstone’s
centroid method. Thurstone’s method of sign reversal was
used until all the column sums were positive; this was followed by
reflection of pairs of columns in accordance with Holley’s criterion
(1947). Two factors were extracted from each of the tables; this
number was decided by means of the following method. (1) A
one-factor solution was assumed and iterations made until the
communalities from the last two iterations were all within ± .005.
When the first-factor residuals were calculated, 28 and 22 of these
were above the .05 level of significance in the two tables, where
19 would have been expected by chance; 12 and 8 respectively
were above the .01 level, where 4 would have been expected. It
was therefore decided to extract a second factor. (2) A two-factor
solution was assumed and iterations carried out until the last two
agreed within ± .001. The second-factor residuals were then
calculated, and in both tables 8 and 2 respectively were found to be
significant at the .05 and .01 levels, where 19 and 4 would have
been expected by chance. The analysis was terminated at this
point. Factor saturations are given in Table XLI together with the
communalities.

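
The chance expectations quoted for the residuals (19 and 4) can be reproduced by simple arithmetic. The sketch below assumes that every off-diagonal cell of the 20 x 20 residual matrix was counted, so that each pair of tests appears twice; this is an inference from the quoted figures, not something the text states. The stopping rule coded here (extract another factor while any observed count exceeds the chance expectation) is likewise a simplified reading of the procedure.

```python
n_tests = 20

# Assumption: all off-diagonal cells are counted, each pair twice,
# giving 20 * 19 = 380 residuals per table.
n_cells = n_tests * (n_tests - 1)


def expected_by_chance(alpha, cells=n_cells):
    """Number of residuals expected to reach a significance level by chance."""
    return alpha * cells


def needs_another_factor(observed, cells=n_cells):
    """True if any table shows more significant residuals than chance allows."""
    return any(
        count > expected_by_chance(alpha, cells)
        for alpha, counts in observed.items()
        for count in counts
    )


# Observed counts as quoted in the text: (normal table, psychotic table).
after_first_factor = {0.05: (28, 22), 0.01: (12, 8)}
after_second_factor = {0.05: (8, 8), 0.01: (2, 2)}

for alpha in (0.05, 0.01):
    print(alpha, round(expected_by_chance(alpha), 1))

print("second factor needed:", needs_another_factor(after_first_factor))
print("third factor needed: ", needs_another_factor(after_second_factor))
```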
We now have available the data with which to test our
hypothesis. As shown in section one, Kretschmer’s hypothesis requires
that the two sets of factor saturations should be proportional to
each other. We must therefore correlate F_n with F_p, and F_n' with
F_p'; the hypothesis requires that these two correlations should be
significantly positive. In actual fact they are +.868 and +.746;
it appears therefore that this deduction is borne out by the facts.
Both factor patterns are remarkably alike.

The next deduction to be tested relates to our Hypothesis I,
and states that F_n and F_p should both be proportional to C_np. The
respective correlations are +.895 and +.954. Again we find our
deduction verified; and it would appear that psychotic states do in
fact form a continuum with normal mental states. Our last
deduction relates to Hypothesis II, and requires that F_n' and F_p'
should both be proportional to C_sd. The correlations in question are
+.029 and +.085, and it will be clear that in this case the
deduction is not verified. It would appear to follow that
schizothymia-cyclothymia does not exist as a separate dimension of
personality.1

TABLE XLI

           Normal Group              Psychotic Group
        F_n     F_n'    h²        F_p     F_p'    h²

  1     .043    .184   .036       .170    .044   .031
  2    -.202    .342   .158      -.379    .361   .214
  3     .119    .048   .016      -.060    .183   .037
  4     .498   -.198   .287       .363   -.165   .159
  5     .383    .052   .149       .396    .196   .195
  6    -.334    .705   .609      -.375    .432   .327
  7    -.176    .157   .056      -.290    .226   .135
  8     .355   -.055   .129       .276   -.330   .185
  9    -.265   -.215   .115      -.373   -.228   .191
 10     .152   -.129   .040      -.373   -.064   .143
 11    -.241    .391   .211      -.448    .170   .289
 12     .623    .417   .562       .174    .171    ?
 13     .329   -.053   .111       .425    .124    ?
 14    -.428    .260   .251      -.497    .142   .267
 15     .084    .288   .090       .244    .121   .074
 16     .063    .508   .262       .183    .357   .161
 17     .240    .156   .082       .509    .156   .283
 18      ?      .054   .102       .537    .120   .303
 19     .107    .169   .040       .202    .579   .376
 20     .007   -.060   .004      -.034   -.099   .011

1 No attempt is made here to interpret F_n' and F_p'. Such interpretation
could be speculative at best, and could serve little useful purpose. The possibility
that this second factor may be related to the personality dimension
extraversion-introversion (Eysenck, 1950c, 1951) has been considered, but until the tests
most highly saturated with factor two have been included in the same battery
with other tests known to measure E-I no such view could be put forward with
any confidence.

Before accepting this negative conclusion, however, it would
appear desirable to study the possible effects which rotation might
have on the emergence of a factor of the kind we are looking for.
As has been pointed out in the original paper on criterion analysis
(1950), one feature of this method is the rotation of factors, not into
simple structure as in Thurstone’s system, but into maximum
correlation with the criterion column. As we have two criterion
columns in the present study, two separate and different rotations
are possible for each of the two matrices: (1) rotation such that
a vector is found which, in the two-factor space, coincides as
nearly as possible with the projection of the C_np column on that
space, and (2) rotation such that a vector is found which, in the
two-factor space, coincides as nearly as possible with the
projection of the C_sd column on that space. In either case, the second
factor is calculated by keeping it orthogonal to the factor first
extracted. When these rotations are carried out the following
results are obtained.

Under rotation (1), correlations between F_n and F_p, and F_n' and
F_p', are now +.80[?] and +.769 respectively. Correlations of F_n and
F_p with C_np are now +.894 and +.951 respectively, while correlations
of F_n' and F_p' with C_sd are still quite insignificant. Rotations
according to (2) are rather more interesting. Correlations between
(F_n) and (F_p), and (F_n') and (F_p'), are now +.830 and +.793
respectively. Correlations of (F_n) and (F_p) with C_sd are now
+.[?]08 and +.679, which seemingly indicates that here we do
have some justification for speaking of a schizothymia-cyclothymia
factor. But this putative factor can be shown to have no real
meaning when we compute the correlation between C_np and C_sd,
which turns out to be +.559. This is simply a confirmation of a
point made before, viz. that differences between schizophrenics
and manic-depressives tend to be in the same direction as
differences between normals and schizophrenics, and that therefore the
schizophrenic group tends to be intermediate between the others.
In other words, the correlations of (F_n) and (F_p) with C_sd appear
due entirely to the correlation of C_sd with C_np, and again we find
no evidence whatever for the existence of a schizothymia-cyclothymia
factor.
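
The kind of rotation described above can be sketched numerically. The function below is an illustration only: it takes the rotation ‘into maximum correlation with the criterion column’ to mean the orthogonal rotation of two factors whose first axis has the highest product-moment correlation with the criterion, and it finds that axis by least squares. The saturations and criterion values are hypothetical and are not those of Tables XL and XLI.

```python
import numpy as np


def rotate_to_criterion(loadings, criterion):
    """Rotate two orthogonal factors so that the first best matches a criterion.

    loadings  : (n_tests, 2) array of unrotated factor saturations
    criterion : (n_tests,) criterion column
    Returns the rotated (n_tests, 2) saturations and the angle in radians.
    """
    loadings = np.asarray(loadings, dtype=float)
    criterion = np.asarray(criterion, dtype=float)
    # Least-squares weights for predicting the centred criterion from the
    # centred saturations; their direction maximises the correlation.
    centred = loadings - loadings.mean(axis=0)
    target = criterion - criterion.mean()
    weights, *_ = np.linalg.lstsq(centred, target, rcond=None)
    theta = np.arctan2(weights[1], weights[0])
    rotation = np.array([[np.cos(theta), -np.sin(theta)],
                         [np.sin(theta), np.cos(theta)]])
    # Column 0 lies along the weight direction; column 1 is orthogonal to it.
    return loadings @ rotation, theta


def corr(x, y):
    """Product-moment correlation between two columns."""
    return float(np.corrcoef(x, y)[0, 1])


# Hypothetical saturations for six tests and a hypothetical criterion column.
unrotated = np.array([[0.50, 0.10], [0.40, -0.20], [0.30, 0.40],
                      [0.20, 0.50], [0.60, -0.10], [0.10, 0.30]])
criterion = np.array([0.35, 0.10, 0.45, 0.50, 0.30, 0.25])

rotated, theta = rotate_to_criterion(unrotated, criterion)
before = [corr(unrotated[:, k], criterion) for k in range(2)]
after = [corr(rotated[:, k], criterion) for k in range(2)]
print("before rotation:", np.round(before, 3))
print("after rotation: ", np.round(after, 3))
```

Since the rotation is orthogonal, the communality of each test is unchanged; only the division of that communality between the two factors alters.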

This conclusion, like others in this chapter, should not be
taken as in any way definitive. It is possible that a selection of tests
which gave more scope to schizophrenic-depressive differences
might produce results more in line with Kretschmer’s hypothesis,
although the fact that tests of his which were included did not
succeed in producing this discrimination makes it somewhat
unlikely that very different results would be reached with a different
selection of tests. It is perhaps significant that wherever results in
this experiment are positive, they are very decidedly so, and where
they are negative, they are equally decidedly negative; the support
for the first hypothesis investigated, and the failure to find any
support for the second hypothesis, are equally impressive in their
decisiveness.

We have shown, so far, that neither neurotics nor psychotics
are something sui generis, qualitatively different from normal
people; instead, we have been able to show that there is a
‘neuroticism’ continuum linking normals with neurotics, and a
‘psychoticism’ continuum, linking normals with psychotics. The
question immediately arises: Are these two continua identical, or
are they independent of each other? Identity would presumably
be assumed by the Freudian theory of ‘regression’, according to
which there is a continuum ranging from the normal, well-adapted
personality, through the neurotic, partly regressed patient, to the
psychotic, almost completely regressed. Independence would be
assumed by most of the psychiatric textbooks written along
Kraepelinian lines. The problem is clearly an important one;
indeed it is fundamental for any useful taxonomic system. There is
no doubt that it is eminently amenable to a systematic
experimental attack.

There are certain fairly obvious clinical findings which make
the assumption of one common continuum somewhat unlikely. If
we were dealing with one single continuum, normals developing
a psychosis should pass through a state of neurotic disorder first;
similarly, neurotics should be far more susceptible to the
development of a psychotic illness than normals. Neither of these
deductions appears to be verified by psychiatric observations. Somewhat
more definite are certain findings from military experience. There
is a clear, monotonic rise in neurotic breakdown rate with
increasing length of military service; there is no such rise in psychotic
breakdown rate. There is a clear drop in neurotic breakdown after
the cessation of hostilities, of between 60 and 70 per cent; there is
no change in psychotic breakdown at all.1 Clearly, neurotic and
psychotic breakdown respond quite differently to environmental
stress, a fact which makes it difficult to maintain the ‘single
dimension’ hypothesis.

1 I am indebted to W. A. Hunt for pointing out the facts to me, and for
allowing me to consult the figures on which the above conclusions are based.

These data, derived as they are from observation rather than
from experiment, support but do not prove the hypothesis that
here we are dealing with two dimensions rather than one. The rest
of this chapter is devoted to a somewhat more formal proof of this
proposition. Let us first of all consider the type of deduction which
may be made from the two hypotheses between which we are
trying to decide. On the ‘single continuum’ hypothesis, we are
dealing with three groups (normals, neurotics, psychotics) whose
mean ‘regression’ scores lie in that order along a single continuum
or dimension. We would, therefore, expect that those tests which
discriminate significantly between normals and neurotics should
also discriminate with even greater significance between normals
and psychotics. We have already seen that this is not so. Tests like
suggestibility, static ataxia, persistence, and the Word Connection
List discriminate between normals and neurotics at a high level of
significance, but fail completely to discriminate between normals
and psychotics. Other tests, like the large selection of expressive
movement tests described in Eysenck (1950c, 1951), discriminate
at a high level of confidence between normals and psychotics, but
fail to discriminate between normals and neurotics. These results
have already been mentioned in passing. It may be more
enlightening to present comparisons based on a few selected tests in
diagrammatic form.

[Figure 28: axis label WORD CONNECTION LIST]

Figure 28 shows a plot in which the ordinate and the abscissa
respectively represent scores on two tests (the Word Connection
List and a test of Length Estimation, scored for amount of
overestimation), the first of which discriminated very significantly
between normal and neurotic groups, while the second
discriminated very significantly between normal and psychotic
groups. Plotted in the diagram are the mean scores of two normal
groups who were given the two tests, two psychotic groups (one
manic-depressive, the other schizophrenic), and three neurotic
groups (one hysteric, one psychopathic, and the third suffering
from anxiety). On the basis of the ‘single continuum’ hypothesis
we would expect these means to lie on a more or less straight line,
with the normal groups at one end, the psychotic groups at the
other, and the neurotic groups intermediate. It will be seen that
the facts do not bear out the hypothesis. It is impossible to regard
the results as compatible with a single-dimension hypothesis; we
need at least two dimensions to accommodate the triangular
normal-psychotic-neurotic grouping. This is even more apparent
in Figure 29, where the average results are plotted for the normals,
psychotics, and neurotics respectively, using total groups not split
up into their constituent parts. Figures 30 and 31 bear out this
impression; using two other sets of tests (suggestibility and an
expressive movement test described in Eysenck (1950c) in the one,
and tapping rate and static ataxia in the other) we find that in
neither case can the three means be arranged in a linear form,
suggestive of one underlying dimension or continuum; in each case
a two-dimensional space is required.

[Figure 29]

These fairly clear-cut examples do not of course constitute a
formal proof; no effort has been made to apply formulae to the
results and show the statistical significance of the deviations from
a straight line. As a demonstration ad oculos, these diagrams
appeared sufficiently striking to demonstrate the logic of the
argument underlying our proof; for the proof itself we must turn to a
somewhat more formal demonstration. Two such demonstrations
will in fact be given, the first more definitely experimental, the
second more statistical in nature.
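
The geometrical point made by the diagrams can be put in numerical form. On a single continuum the three group means should lie on one straight line; the sketch below simply measures how far the third mean lies from the line through the other two. It is a descriptive illustration with hypothetical means, not the book's data, and like the diagrams it involves no test of statistical significance.

```python
import numpy as np


def deviation_from_line(p, q, r):
    """Perpendicular distance of point r from the line through p and q."""
    p, q, r = (np.asarray(v, dtype=float) for v in (p, q, r))
    d = q - p
    e = r - p
    # The 2-D cross product equals twice the signed area of triangle pqr.
    cross = d[0] * e[1] - d[1] * e[0]
    return abs(cross) / np.hypot(d[0], d[1])


# Hypothetical group means on two tests (test 1 score, test 2 score).
normal = (10.0, 5.0)
psychotic = (11.0, 12.0)
neurotic = (18.0, 6.0)

gap = deviation_from_line(normal, psychotic, neurotic)
print(f"neurotic mean lies {gap:.2f} units off the normal-psychotic line")

# Three means that do lie on one continuum give a deviation of zero.
print(deviation_from_line((0, 0), (2, 2), (1, 1)))
```

A deviation that is large relative to the scatter of the group means is what the ‘triangular’ grouping in the figures amounts to; judging whether it is significantly large would need the sampling errors of the means, which are not given here.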

The experimental study alluded to was carried out by
Freeman (1951), using the Character Interpretation test
illustrated in photographs 29 and 30. The nature of the test will be
apparent from these reproductions. Twelve photographs of human
faces (male and female, adult and child) are given to the subject,
together with a set of 24 adjectives for each. The various sets of
adjectives are divided into two lists of 12 each, called List A and
List B; instructions are to put a check mark beside the four words
in List A which best describe the person shown in the picture, and
then to put a cross beside the four words in List A which least
describe the person. The same is then done for the adjectives in
List B.

The twelve adjectives in each List are selected according to a
scheme which is identical for all the pictures. Two adjectives
relate to feelings of (a) hostility, two to feelings of (b) self-importance,
two to (c) immorality (other than sexual), two to (d) sexual
immorality, two to (e) depressive feelings, and two to (f) fearfulness
and insecurity. One of the two terms in each case is worded
positively, the other negatively; ‘virtuous’ and ‘loose’ would thus
form a contrasting pair of words in the sexual category, ‘uneasy’
and ‘undisturbed’ would form a similar pair in the fearful-insecure
category. The adjectives in each case were chosen so as to have
some apparent relevance to the particular picture in question.
Negative (unfavourable) reactions (either putting a check mark
beside an unfavourable adjective, or putting a cross beside a
favourable one) are given higher scores than positive (favourable)
reactions (either putting a cross beside an unfavourable adjective,
or putting a check mark beside a favourable one).

Scoring generally is done on several levels. At the lowest level,
each adjective is scored separately. At the second level, the two
adjectives forming a pair in a particular list are combined. At the
third level, corresponding pairs in Lists A and B are combined.
At the fourth level, all categories for a picture are combined,
giving a general index of liking or disliking for that picture. And
at the highest level, all these indices are combined in a grand total,
expressing an individual’s liking or disliking towards all the pictures.
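
The five scoring levels can be sketched in a few lines of Python. This is only an illustration: the point values (2, 1, 0), the function names, and the layout of the response record are assumptions made for the example, since the text states only that unfavourable reactions receive higher scores than favourable ones.

```python
from collections import defaultdict

# Hypothetical point values: the text says only that unfavourable
# reactions score higher than favourable ones.
UNFAVOURABLE, NEUTRAL, FAVOURABLE = 2, 1, 0


def adjective_score(polarity, mark):
    """Level 1: score one adjective from its wording and the subject's mark."""
    if mark is None:
        return NEUTRAL
    # A check beside a negative word, or a cross beside a positive word,
    # counts as an unfavourable reaction.
    unfavourable = (polarity == "negative") == (mark == "check")
    return UNFAVOURABLE if unfavourable else FAVOURABLE


def score_levels(responses):
    """responses maps (picture, list_name, category, polarity) -> mark or None."""
    pair = defaultdict(int)      # level 2: pair of adjectives within one list
    category = defaultdict(int)  # level 3: Lists A and B combined
    picture = defaultdict(int)   # level 4: index of liking for one picture
    for (pic, lst, cat, pol), mark in responses.items():
        s = adjective_score(pol, mark)
        pair[(pic, lst, cat)] += s
        category[(pic, cat)] += s
        picture[pic] += s
    grand_total = sum(picture.values())  # level 5: all pictures combined
    return dict(pair), dict(category), dict(picture), grand_total


# Invented responses for one picture and one category.
example = {
    (1, "A", "sexual", "positive"): "cross",
    (1, "A", "sexual", "negative"): "check",
    (1, "B", "sexual", "positive"): "check",
    (1, "B", "sexual", "negative"): None,
}
pairs, categories, pictures, total = score_levels(example)
```

Each higher level is simply a sum over the level below it, which is why a single pass over the responses is enough.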

The general hypothesis underlying this test will be apparent 
from the description. Differences between subjects in assigning 
character traits to the people whose pictures constitute the test 
may be presumed to be related to the personalities of the subjects 
themselves; if that be true, it would follow that the scores of men 
would differ from the scores of women, or that the scores of 
neurotics would differ from those of psychotics, and those of both 
groups from the scores of normal subjects. 

These conclusions would only follow, of course, if our system of 
scoring possessed a certain amount of reliability; like any objective 
scoring system for what is essentially a projective test (properly so 
called in this instance, as the mental mechanism hypothesized is 
that of projection) the one employed here may be putting together 
on a priori grounds scores which psychologically are quite un- 
related. Reliabilities were calculated at various levels of combina- 
tion, and are sufficiently high to show that such a criticism would 
not be entirely justified. Reliabilities at the third level (List A
versus List B for pairs of adjectives) for 100 normal male subjects
average around .40. At a higher level, we compare total ‘like’
score for a particular category in List A for a given picture with
total ‘like’ score for a particular category in List B for the same
picture. Below are given the reliabilities, averaged over all 12
pictures, for 100 normal, 50 psychotic, and 50 neurotic subjects,
all of them male. The values are: .67, .68, and .67. Considering
the small number of choices which makes up each score, these
reliabilities are rather encouraging. Neither age nor intelligence
was found to be correlated with the scores.
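
A List A versus List B reliability of this kind can be sketched as a correlation between the two half-scores over subjects. The text does not say which coefficient was used; the product-moment formula and the five pairs of scores below are assumptions made purely for illustration.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation between two equally long score lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Invented 'like' scores of five subjects on List A and on List B.
list_a = [3, 5, 2, 6, 4]
list_b = [4, 5, 1, 6, 3]
reliability = pearson_r(list_a, list_b)
```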

We may now return to the main reason why we introduced 
a discussion of the Character Interpretation test into this chapter. 
If our assumption is justified that normal, psychotic and neurotic 
subjects will react differently to the stimuli provided, and that 
their reactions are the product of fundamental personality traits, 
then it would follow that on the Freudian hypothesis of a single 
normal-neurotic-psychotic dimension the psychotic deviations 
from the normal pattern should be in the same direction as the 
neurotic deviations, but more strongly marked. If no such
generalization could be derived from the data, then the ‘single
continuum’ hypothesis would appear to be invalidated.

In order to avoid the necessity of quoting the very
lengthy and complex material relating to all the pictures in the
test, we shall give details only for the two pictures reproduced in
this book as photographs 29 and 30. The pictures represent a
male negro and a white woman respectively. In the case of the
negro, the scores for categories a to f, summed over Lists A and B,
are as follows: 26.56 for 100 normal male subjects; [illegible] for 50
male psychotics; and 28.84 for 50 male neurotics. Two of the three
differences are significant (normals versus psychotics) and very
significant (psychotics versus neurotics), and it will be noticed that
neurotics and psychotics differ from normals in opposite directions:
the psychotics give more ‘like’, the neurotics more ‘dislike’
responses than the normals. On the other picture, the scores of the
same three groups are respectively 30.26, 25.52, and [illegible]. The
normal-psychotic and the psychotic-neurotic differences are
very significant. In this case the psychotics again have higher ‘like’
scores than the other two groups, but the neurotics are not in any
way differentiated from the normals.

Results from these pictures are typical of all the others: there
is no tendency for results to fall into the normal-neurotic-psychotic
arrangement called for by the Freudian hypothesis. Other
sequences, such as neurotic-normal-psychotic, or normal-psychotic-
neurotic, appear much more frequently, thus suggesting that we
are dealing with more than a single continuum.

At the conclusion of the test, the subjects were asked to endorse
a similar set of adjectives as applied to themselves, first as seen by
the person whose photograph they liked best, and then as seen
by the person whose photograph they liked least. The actual
instructions for this test are reproduced on the opposite page. The
construction of the sets of adjectives, and the method of scoring, are
the same for these two tests as they were for the twelve picture
test items. On the first test (adjectives marked as by the person
liked best) the sequence of scores is normals (12.58), psychotics
(15.04), and neurotics (17.80); all these differences are significant
or very significant. Normals, apparently, consider themselves liked
best, neurotics consider themselves liked least, with psychotics in
between. On the second test (adjectives marked as by the person
liked least) the sequence of scores is normals (21.06), psychotics
(21.32), and neurotics (24.84); neurotics differ significantly from
both normals and psychotics. Here again, normals consider
themselves liked best, neurotics consider themselves liked least, with
psychotics in between. These results also fail to support the
‘single continuum’ hypothesis, at a high level of statistical
significance.
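
The logic of the ordering argument can be made explicit with a small check. The sketch below uses the group means quoted for the two self-description tests; the function name is an assumption, and the check looks only at the order of the means, not at their statistical significance.

```python
def fits_single_continuum(normal, neurotic, psychotic):
    """True if the neurotic mean lies between the normal and psychotic means,
    as a single normal-neurotic-psychotic dimension would require."""
    return normal < neurotic < psychotic or normal > neurotic > psychotic


# Group means quoted in the text for the two self-description tests.
MEANS = {
    "liked best": {"normal": 12.58, "psychotic": 15.04, "neurotic": 17.80},
    "liked least": {"normal": 21.06, "psychotic": 21.32, "neurotic": 24.84},
}

results = {
    test: fits_single_continuum(m["normal"], m["neurotic"], m["psychotic"])
    for test, m in MEANS.items()
}
```

On both tests the psychotic mean, not the neurotic mean, occupies the middle position, so the check fails in each case.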

We must now turn to a somewhat more statistically oriented
study. The method used in it has already been alluded to in our
discussion of the work of Rao and Slater, in which they showed
that psychiatric ratings of hysterics, psychopaths, [illegible],
anxiety states, and normals on thirteen points could be explained
entirely in terms of one dimension (identifiable with neuroticism),
without requiring any other continuum than differences in severity
of neurosis to account for diagnostic differences between the groups.
The method to be used has been developed independently by
several workers including C. R. Rao (1948) and is based on
Fisher’s maximum likelihood principle; it has certain similarities
with the independently developed solution of Penrose (1947)
and Smith (1947), and falls under the general heading of
discriminant function analysis. Rao & Slater (1949) essentially use
certain linear discriminant functions called ‘canonical’ to give a
geometrical representation of the differences between the various
neurotic sub-classifications; their method requires an answer to the
dimensional problem (how many dimensions are sufficient to
represent all the groups measured?) and supplies a test of
significance which is unfortunately lacking in the traditional factorial
approach to this question. Lubin (1950) has put Rao's method
into a form in which it can more easily be understood by workers
in the psychological field, and has also carried out the analysis on
which this section is based.1

We have already shown that both ‘neuroticism’ and ‘psychoticism’
are continuous variables, ranging all the way from the
extremely well adjusted, mature, stable type of personality to the
extremes of neurotic or psychotic abnormality. Yet for certain
purposes it is desirable to classify individuals into groups (normal,
psychotic or neurotic) for various social and administrative
reasons. A person has to be certified [...] rejected from the army
because of neurotic disability; this again is an all-or-none decision.
Diagnosis as neurotic or psychotic may determine the course of
treatment: the neurotic may receive psychotherapy, the psychotic
may receive E.C.T., or be subjected to leucotomy. In a similar way,
although intelligence is generally considered to be a continuous
variable, the decision to label a person as ‘mentally defective’ is in
essence based on a system of classification in which everyone is either
‘normal’ or ‘subnormal’. Categorical decisions are essential
in practice; it is the task of science to aid in the correct
choice of categories, and to minimize the number of
misclassifications.

1 The psychological interpretation of Lubin’s results given on the pages
below was made by the writer; Lubin’s thesis is concerned only with the
statistical issues involved.

The best available method for classifying individuals into 
mutually exclusive categories on the basis of quantitative scores is 
Fisher’s discriminant function (1936, 1938). It is applied in the
present case to three groups of subjects—50 normals, 50 psychotics,
and 50 neurotics—of roughly equal age and sex distribution. None 
of the psychotics were certified, and none were of the chronic, 
deteriorated type. The normals were chosen on the usual negative 
criterion of not being to our knowledge under treatment for mental 
disorders of any kind. It is almost certain that the criterion is far 
from perfect, and that if the whole group of 150 subjects were to 
be seen and diagnosed by a psychiatrist ignorant of their previous 
diagnosis, a correlation of less than unity would be found. This 
fact will seriously attenuate our results, but is unlikely to lead to 
exaggerated claims as to the value of the method. Four tests of 
manual dexterity were given to all the subjects, the tests chosen 
being tests M, N, O, and P from the General Aptitude Battery of 
the United States Employment Service, whose kindness in allow- 
ing us to use their tests is acknowledged with gratitude. (The tests 
are illustrated in photographs 19, 20, 23.) Again, the choice of 
tests was such that our findings would constitute a minimum 
rather than a maximum claim; it is well known that the higher 
the intercorrelations between tests, the less likely are they to make 
independent contribution to a multiple selection problem. It will 
be clear that if normals, psychotics, and neurotics can be dis- 
criminated successfully on the basis of four manual dexterity tests, 
then much better discrimination could be achieved on the basis 
of a larger number of more diversified tests. However, we are here
concerned more with the method than with the results, and even 
with the restrictions imposed on the solution by the choice of tests 
and subjects, results will be shown to be markedly positive and in 
accord with our hypothesis. 

The first step in the procedure requires the condensation of the
four tests into the two canonical variates that will best differentiate
the three groups in two dimensions;1 the second step requires the
application of Rao's likelihood solution for classification. Table
XLII shows the results of the first step; in it are given the means,
variances, and correlation ratios of the two canonical variates, Y₁
and Y₂. It will be seen that Y₁ has a correlation of .64 with the
trichotomous classification, while Y₂ has a correlation of .45 with
it. The first of these is the largest possible correlation ratio of any
linear combination of the four tests with the criterion, while the
second is the largest possible correlation ratio of any linear
combination of the four tests which at the same time correlates zero
with Y₁. In combination, Y₁ and Y₂ correlate with the criterion
to the extent of .78.


TABLE XLII

                                                    Correlation
                 Normals   Neurotics   Psychotics      Ratio
Mean of Y₁        78.240     66.540      59.420         .64
Mean of Y₂        40.660     48.560      39.280         .45
Variance of Y₁    79.166     55.192     136.167
Variance of Y₂    56.596     67.639      64.047
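
As a rough check on how the correlation ratios relate to the group means and variances, the following sketch recomputes them from the summary figures of Table XLII. It assumes 50 subjects in each group (as stated in the text) and that the tabled variances use the n - 1 divisor; the latter is an assumption, and the function name is invented for the example. The combined figure relies on the two variates being uncorrelated, so that their squared ratios add.

```python
import math


def correlation_ratio(means, variances, n_per_group):
    """Correlation ratio (eta) from group means and within-group variances.

    Assumes equal group sizes and variances computed with the n - 1 divisor.
    """
    grand_mean = sum(means) / len(means)
    ss_between = n_per_group * sum((m - grand_mean) ** 2 for m in means)
    ss_within = (n_per_group - 1) * sum(variances)
    return math.sqrt(ss_between / (ss_between + ss_within))


# Order of groups: normals, neurotics, psychotics (values from Table XLII).
eta_y1 = correlation_ratio([78.240, 66.540, 59.420], [79.166, 55.192, 136.167], 50)
eta_y2 = correlation_ratio([40.660, 48.560, 39.280], [56.596, 67.639, 64.047], 50)

# Two uncorrelated variates: the squared ratios add.
combined = math.hypot(0.64, 0.45)
```

Under these assumptions the recomputed ratios come out close to the tabled .64 and .45, and the combined value rounds to the .78 quoted in the text.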


Before calculating Y₁ and Y₂, we must of course test the latent
roots for significance. It is always [...] means in m-1 dimensions
[...] before [...]; however (two in this case) [...] lie along one
dimension, or [...]; we must show (1) that there are [...] the groups,
and (2) what [...] which is required to represent the differences [...]
means. It is the possibility of such a test of significance [...]
constitutes the main claim of this [...]. [...] Bartlett’s (1938) test,
both roots are significant at the P = .001 level. We may therefore
conclude that two dimensions are necessary and sufficient to account
for our observed test data. Interpretation of Y₁ and Y₂ will be
attempted later on; here we may point out simply that inspection of
the values in Table XLII suggests that Y₁ is identical with our
‘psychoticism’ factor, and Y₂ with our ‘neuroticism’ factor.

If we wish to go on and assign our 150 subjects to their
appropriate class on the basis of their test scores, we must transform
the canonical variates, Y₁ and Y₂, into likelihood functions of the Rao
type. This procedure, which will minimize the proportion of
misclassifications, consists of calculating the likelihood for each
individual of belonging to each of the three groups, and then
assigning him to that group for which he has the maximum
likelihood. Table XLIII gives the results: it will be seen that 71 per
cent of all cases are correctly classified.1 A non-parametric test of
significance (Lubin, 1950) applied to this table shows that
misclassifications are very significantly fewer than chance (P < .001).

TABLE XLIII

                              Psychotic   Neurotic   Normal   Total
Predicted     Psychotic           34          9         3       46
Group         Neurotic             9         32         6       47
Membership    Normal               7          9        41       57
                                                               150
Percentage correctly
classified:                       68%        64%       82%    71.3%

As was pointed out earlier, the quadratic likelihood solution 
using the canonical variate scores is not the only one possible. We 
may use a linear rather than a quadratic solution, and we may 
use the original test scores rather than the canonical variates based 
on them. These various methods give rather similar results. Results 
of the various methods are compared in Table XLIV. 

A visual presentation of the data is given in Figure 32. The
two canonical variates, Y₁ and Y₂, constitute the ordinate and the
abscissa: the mean values of the 150 subjects are plotted according
to their scores on Y₁ and Y₂. The three lines which split the
diagram into three parts are the discriminant function lines, giving
the lowest degree of misclassification. The resemblance of the
positions of the three groups to those of the three groups in Figures
29, 30, and 31 will be noted. It will also be noted that the position
of the psychotic and normal means, which are parallel to the Y₁
axis, justifies our interpretation of this canonical variate as
‘psychoticism’, while the fact that Y₂ discriminates the neurotic
mean from the two other means may serve as justification of our
interpreting this canonical variate as one of ‘neuroticism’.

1 It will be seen that the total number classified into each group does not
correspond with the number known to be in that group. This is an obvious
difficulty of this method which has received a proper solution only recently
(Rao, 1948).

TABLE XLIV

Percentage Correctly Classified

Method      Variables Used       Total   Normal   Neurotic   Psychotic
Quadratic   Canonical variate    71.3      82        64         68
Linear      Canonical variate    68.7      78        58         70
Quadratic   Test scores          72.7      80        66         72
Linear      Test scores          68.7      80        68         68

[Figure 32. Predicted group membership on the basis of the quadratic
likelihood canonical variate scores. Axes: Y₁ (scale 52 to 76) and Y₂
(scale 41 to 53); marked points: psychotic group mean, neurotic group
mean, normal group mean.]

Not too much importance is attributed to the actual discrimin- 
ation achieved. It is well known that statistical combinations of 
several scores in order to give maximum correlation with a 
criterion tend to give spuriously high values because of the utiliz- 
ation of chance errors which happen to correlate with the predicted 
variables. The fact that the zero-order correlation ratios are
relatively high, being equal to .50, .57, .56, and .51 respectively for
tests M, N, O, and P, makes it unlikely that the multiple R would
drop very much on repetition of the experiment (two such
repetitions are under way at the moment). Also it is clear that a more
careful selection of subjects, and a more varied selection of tests, 
would result in substantial improvement on the figures given here. 
However, our main purpose in quoting this study lay not in the 
demonstration of the possibilities for differential diagnosis of objec- 
tive tests (welcome as this proof may be); it lay in the demonstra- 
tion that two dimensions are required in order to account for the 
distribution of scores made by normal, psychotic, and neurotic 
subjects. This conclusion agrees with that derived from our 
factorial studies, and adds the important element of statistical 
significance; where previously our results had to be evaluated 
more or less on an impressionistic basis, they can now be stated in 
terms of exact P values. 


Chapter Seven 


APPLICATIONS OF DIMENSIONAL ANALYSIS 


In this chapter a number of investigations are described in which
an attempt has been made to apply the dimensional approach
to various practical problems. Included are studies into the
after-effects of prefrontal leucotomy, into the employability of
mental defectives, into work adjustment and productivity of
unskilled factory workers and into the selection of students and nurses.

[...] and important of the series of [...] chapter is the investigation
of the after-effects of leucotomy, carried out by A. Petrie (1952). Its
importance lies not only in the practical necessity of knowing just what
sort of personality change may be expected to occur after the fibres
connecting the frontal lobes with the thalamus have been sectioned
[...]; this study illustrates very well a point on which
stress has been laid before, namely, [...] use of the
hypothetico-deductive method [...] investigation. Too much work has
been done on the after-effects of
leucotomy in a spirit of blind empiricism, of trying any test that
came to hand that might be relevant; the almost completely
negative results of this approach will be familiar to most readers,
and are catalogued in Crown’s very able survey of the field (1951).
What is required, surely, is first of all an hypothesis, couched in
operational terms, as to what after-effects would be expected,
followed by an experimental design which permits unambiguous
verification or refutation of deductions made from this theory.
Even if unsuccessful, such a procedure would increase our
knowledge far more than any number of ad hoc empirical tests based on
no clear underlying hypothesis.

The results of the almost universal failure to make use of this 
method have been rather melancholy. Bronfenbrenner (1951) has 
stated the position admirably: ‘It is difficult for the would-be 
theorist to avoid being forced in one of two dissociated directions. 
If he covets his reputation as a scientist, he is under pressure to 
confine himself to the analysis of relatively simple phenomena 
where the variables are few, discrete, and susceptible to rigorous 
experimental control. The most significant aspects of human 
behaviour, however, are not likely to be found in this category, 
for they are characteristically elusive and multideterminate. To
wrestle with these at a realistic level and at the same time to face 
up to the expectations and criticisms of fellow-scientists take more 
time, energy, patience, and self-integration than many able men 
command. It is far easier to remain free from such demands by
doing one’s theorizing in a non-scientific context. As a result, it is 
perhaps possible to say—with only moderate exaggeration—that 
the study of human behaviour in America shows a bimodal
distribution with undisciplined speculation at one mode and rigorous
sterility at the other.’ This study will illustrate how it is possible to
achieve scientific rigor, while avoiding rigor mortis, through the use 
of the hypothetico-deductive method. 

The first two hypotheses advanced by Petrie are illustrated in
Figure 33. Taking her cue from the isolation of the two personality
dimensions of neuroticism and introversion-extraversion, she posits
that (1) neurotic patients after leucotomy will show a decrease in neuroticism,
and (2) that neurotic patients after leucotomy will show an increase in
extraversion. An additional hypothesis relates to another personality
dimension, this time in the cognitive field, and states that (3)
neurotic patients after leucotomy will show a decrease in intelligence. It
should be noted that these predictions apply to neurotic patients
only; the changes to be expected with psychotics will be discussed
later in connection with the experimental work of Crown (1950).

The use of such terms as neuroticism, extraversion, and intelligence
presupposes some sort of operational definition, as otherwise no
testable deductions from these hypotheses are possible. Six tests,
taken in the main from Dimensions of Personality, were used to
measure neuroticism: body-sway suggestibility, manual dexterity,
greater smoothness of work-curves, less self-criticism and fewer
inferiority feelings on self-rating scales, greater speed of
handwriting, and less disposition-rigidity.

[Figure 33. Diagram with the axes EXTRAVERT-INTROVERT and
NORMAL-NEUROTIC, showing the ORIGINAL POSITION of the patient.]

Another set of six tests was used to measure introversion:
accuracy/speed ratio, intropunitiveness, relative dislike of sexual
humour, verbal/non-verbal intelligence test ratio, persistence, and
a preoccupation with the past rather than with the present or the
future. The prediction made on the basis of the original hypothesis
requires the following changes after leucotomy: lower speed/
accuracy ratio, decrease in intropunitiveness, increase of sex-
humour appreciation, lowering in the verbal/non-verbal
intelligence test ratio, lessening in persistence, and a lessening in
pre-occupation with the past.

Regarding the changes in intellectual capacity, the Wechsler- 
Bellevue and the Porteus Mazes tests were used, as well as the 
‘Proverbs’ test from the Revised Stanford-Binet test. The direction 
of changes to be expected will be self-evident from the original 
hypothesis. 

Two separate types of operation were performed. The first of 
these is the so-called Posterior Standard Leucotomy, described in 
detail elsewhere (McKissock, 1943). The other type of operation, 
the Anterior Rostral Leucotomy, is an open bilateral operation, 
carried out through a pair of burr holes under direct vision. It is 
more anterior, and therefore presumably less damaging, than the 
Posterior Standard Leucotomy; full details will be found elsewhere 
(McKissock, W., 1949). Two further hypotheses were based on 
the differences between these two types of operation. In view of 
the fact that the Rostral operation is less far-reaching, it was 
posited (4) that the same changes would be found as in the Standard 
operation, but that their extent would be less extreme. The other deduction
is based on the assumption that the great variability in the after- 
effects of the Standard operation may be due to operative vari- 
ability consequent upon lack of visual control, a lack not present 
in the open Rostral operation. It would seem to follow, if this 
reasoning be correct, that (5) there should be less variability in the 
after-effects of the Rostral operation. 

The patients operated upon were severe neurotics, primarily 
those suffering from obsessional symptoms and from anxiety; they 
would in the main fall into the category of ‘dysthymics’ as described 
in Dimensions of Personality. The same principle of selection was 
followed in choosing patients for either type of operation, and
there is little reason to suspect the group of patients who were 
given the Standard operation to have been different in any im- 
portant respect from the group of patients who were given the 
Rostral operation: 20 patients in the Standard group were tested 
before the operation, and retested three and nine months after the 
operation; 15 patients in the Rostral group were tested before the 
operation, and retested six months after the operation. Some
additional patients in the Standard group only received one retest;
some additional patients in the Rostral group received only
unilateral incisions. Results from the main groups only are summarized
here; a full report will be found elsewhere (Petrie, 1952).

The results of the tests as applied to the Standard operation 
group of patients are given in Tables XLV, XLVI, XLVII; these 
figures, it should be noted, are merely a brief summary of the many 
lengthy and detailed tables given by Petrie. Regarding the pre- 
dicted decrease in neuroticism, it will be seen that the direction of 
the change for all six tests is as predicted; this is true both for the
three-months and the nine-months retest. Most of the C.R.s are fully
significant, and for every test there is at least one significant change 
in the predicted direction—either after three or after nine months. 
We may conclude that with respect to this dimension of person- 
ality, the original hypothesis is fully borne out. 


TABLE XLV
CHANGES AFTER LEUCOTOMY: CRITICAL RATIOS

                                      3 months    9 months
(1) Neuroticism
    (a) Body sway suggestibility        2.33        2.91
    (b) Manual dexterity                6.14        3.95
    (c) Smoothness of work curves       2.34        N.S.
    (d) Self-rating scale               N.S.        2.86
    (e) Tempo of handwriting            N.S.        2.43
    (f) Disposition-rigidity            1.90        2.12


With respect to the Introversion-Extraversion dimension, the
results are equally clear. Again all tests change in the predicted
direction, and with one exception all the changes are significant at
a high level of confidence. (It would have been permissible in all
cases to use the one-tail test. In order to avoid any possibility of
arbitrary statistical manipulation of data, however, Petrie used the
two-tail test in connection with computations arising from the
Standard operation, thus reducing the chances of finding significant
differences.)
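
The difference between the one-tail and the two-tail test can be shown numerically. The sketch below treats a critical ratio as a standard normal deviate, which is an assumption of the example (the text does not state the reference distribution), and uses the C.R. of 1.90 reported for disposition-rigidity at three months; the function names are invented for the illustration.

```python
import math


def one_tail_p(critical_ratio):
    """Probability of a normal deviate at least this large in the predicted direction."""
    return 0.5 * math.erfc(critical_ratio / math.sqrt(2))


def two_tail_p(critical_ratio):
    """Probability of a normal deviate at least this large in either direction."""
    return math.erfc(abs(critical_ratio) / math.sqrt(2))


cr = 1.90  # disposition-rigidity, three-months retest (Table XLV)
p_one = one_tail_p(cr)
p_two = two_tail_p(cr)
```

For a change in the predicted direction the two-tail probability is exactly twice the one-tail probability, so a critical ratio of this size passes the .05 level on the one-tail test but not on the two-tail test. This is the sense in which the two-tail test reduces the chances of finding significant differences.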

TABLE XLVI
CHANGES AFTER LEUCOTOMY: CRITICAL RATIOS

                                               3 months        9 months
(2) Extraversion-Introversion
    (a) Speed/Accuracy ratio                     1.10            1.75
    (b) Intropunitiveness                        2.53            3.34
    (c) Sex humour liking                     Significant *   Significant *
    (d) Verbal/non-verbal Intelligence test   Significant *   Significant *
    (e) Persistence                              2.82            2.42
    (f) Preoccupation with past               Significant *   Significant *

* Data in original significant but not expressed in terms of C.R.

With respect to intelligence, again the figures are very clear.
The only test which does not show any decline after the operation
is the Wechsler Performance scale; the Verbal scale, the Full scale,
and the Porteus Mazes test all show the predicted decline at a
highly significant level. Certain points ought to be borne in mind
here, however. It is somewhat artificial to divorce the cognitive
functions from personality as a whole, and the fact that the ratio
of verbal to non-verbal performance on intelligence tests is a good
measure of introversion (Himmelweit, 1945) shows the close
relation which obtains between temperament and intelligence. The
differential decline here found (marked with respect to verbal
intelligence, non-existent with respect to performance) is exactly
what would have been predicted on the basis of Petrie's second
hypothesis. The Porteus test also involves personality factors
outside the purely cognitive field; according to hypothesis (2) speed
and accuracy on this test should change after the operation, and
indeed both changes are observed (C.R.s 2.47 and 3.25
respectively for the three months’ retest). No figures are given for the
Binet Proverbs test, but qualitative observation of responses leaves
little doubt of considerable deterioration in ability to generalize.

TABLE XLVII
CHANGES AFTER LEUCOTOMY: CRITICAL RATIOS

                                  3 months    9 months
(3) Intelligence
    (a) Wechsler Verbal             4.81        3.09
    (b) Wechsler Performance        0.24        1.04 (improved)
    (c) Wechsler Full Scale         4.02        2.23
    (d) Porteus Mazes               3.14        1.67

These three tables then verify in almost every detail the
prediction made, and suggest very strongly that in this type of patient
a marked change takes place with respect to his position on the
three dimensions of personality involved. Much additional
information is given in Petrie's book which supports this conclusion, but
we have no space to discuss it in any greater detail. Instead, we
must now turn to the changes which follow after the open Rostral
type of operation.

These changes are summarized in Table XLVIII; they are
in the same direction as those reported in connection with the 
Standard operation, and most of them will be seen to be significant. 
Affecting any direct comparison is the fact that the number of 
patients submitted to the two types of operation is unequal; it 
will be remembered that 20 patients had the Standard operation 
done, while only 15 had the Rostral operation done. When this 
qualification is borne in mind, there remains little doubt that 
the non-cognitive changes observed after the Rostral operation are 
identical in direction, but smaller in extent, compared with the 
changes observed after the Standard operation. Thus hypothesis 
(4) is seen to be verified, and in so far as the results are relevant to 
hypotheses (1) and (2) they must be regarded in the nature of a 
replication of the first experiment, thus establishing our faith in the 
reproducibility of these results on a much firmer basis. 


TABLE XLVIII
CHANGES AFTER ROSTRAL LEUCOTOMY: CRITICAL RATIOS

(1) Neuroticism
    (a) Body sway suggestibility                 2.27
    (b) Manual dexterity                         2.64
    (c) Smoothness of work curves                N.S.
    (d) Self-rating scale                        2.93
    (e) Tempo of handwriting                     4.67
    (f) Disposition-rigidity                     1.85

(2) Extraversion-Introversion
    (a) Speed/Accuracy ratio                     N.S.
    (b) Intropunitiveness                        2.07
    (c) Sex humour liking                        Significant *
    (d) Verbal/Non-verbal Intelligence test      N.S.
    (e) Persistence                              N.S.
    (f) Preoccupation with past                  Significant *

(3) Intelligence
    (a) Wechsler Verbal                          1.12
    (b) Wechsler Performance                     1.65
    (c) Wechsler Full Scale                      1.21
    (d) Porteus Mazes                            0.33

* Data in original significant but not expressed in terms of C.R.

This confirmation only applies to the non-cognitive changes,
i.e. to changes along the dimensions of neuroticism and extra- 
version-introversion. Changes in intelligence level do not follow 
the pattern observed after Standard operations; the obvious loss 
found in patients operated on according to the Standard procedure 
is absent in patients operated on according to the Rostral pro- 
cedure. This result argues in favour of a specific location hypothesis 
with respect to certain aspects of intellectual functioning, and may 
thus explain the contradictory reports on intellectual after-effects 
of leucotomy which have characterized the literature (Crown, 
1951). While this finding thus opens up exciting new research 
prospects, it is not intimately connected with our main theme, and 
will not therefore be pursued further. 

One way of illustrating the fact that in the cognitive field the 
effects of the two operations are dissimilar, while in the non- 
cognitive field they are similar, is to calculate correlation coeffi- 
cients between the critical ratios of the differences in test scores 
following on the two operations. Comparing the first Standard 
operation retest with the Rostral operation retest, the correlations 
for cognitive and non-cognitive test C.R.s are respectively −.400
and +.211; comparing the second Standard operation retest with
the Rostral operation retest, the correlations are −.143 and
+.369 respectively. Thus after the longer period the changes in
temperament and character had become increasingly similar in 
the two operations, while the dissimilarities between the cognitive 
after-effects on the two operations had decreased. (It is quite 
justifiable to use C.R.s in this fashion, although the number of 
patients in the two groups is different.) 
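The comparison just described can be sketched in a few lines. This is a minimal illustration only: the function name `pearson_r` and the two short lists of critical ratios are hypothetical and are not taken from Petrie's data; the sketch merely shows how a product-moment correlation between two sets of C.R.s, one per operation, might be computed.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and of squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical critical ratios for the same four tests under each operation.
standard_crs = [2.0, 3.0, 1.0, 4.0]
rostral_crs = [1.0, 2.0, 0.5, 2.5]

print(round(pearson_r(standard_crs, rostral_crs), 3))
```

A positive coefficient would indicate that tests showing large changes after one operation tend also to show large changes after the other; a negative coefficient would indicate the reverse.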

We may now turn to the fifth hypothesis, which used the obvious
differences between an open and a closed operation to predict
greater variability of after-effects in the closed operation. To investigate
this hypothesis, the S.D.s of the differences in tests on which
significant changes were found after both types of operation were
compared. The results favour the hypothesis strongly with respect
to measures of introversion-extraversion, but are ambiguous with
respect to measures of neuroticism. We can only regard this
hypothesis as partially confirmed, therefore, and must await a
repetition of the experiment, possibly with larger numbers of
subjects, before deciding definitely either for or against it. It should
be remembered, in this connection, that while Meyer (1949, 1950)
and others have shown that there is considerable variation in the
incisions made in the Standard operation, it has not yet been
ascertained that the open Rostral operation does actually lead
to more precise results. Possibly part of the variation in the cut
may be due to the elasticity of the brain tissue, and part to hemorrhage
and interference with blood supply. This is a complex
problem, and no final conclusion can be claimed.

1 The number of tests involved in these correlations is 19 and 32 respectively
for the first retest, and 19 and 41 for the second retest. The other tests
are fully described in Petrie (1952).

In summary of this research, we may say that a large number 
of highly significant changes has taken place after prefrontal leuco- 
tomy, a finding which in itself contrasts markedly with the usual 
dearth of positive results in this field. These changes are in almost 
every case in the direction predicted by the original hypothesis, 
and the results obtained with the Standard operation are dupli- 
cated by the results obtained with the Rostral operation, although 
in a less marked manner. This decrease in extent of change is itself 
in line with the original hypothesis. Variability of effects is somewhat
more marked after the closed than after the open operation,
but although this result is in line with the original hypothesis it 
cannot be regarded as firmly established. The failure to find any 
change on cognitive tests after the Rostral operation, although 
considerable changes were found after the Standard operation, 
was not predicted, and must await confirmation before it can be 
accepted as a fact. 

These results, it will be remembered, were obtained on neurotic 
subjects, and the predictions were made with reference to neurotic 
subjects only. What hypotheses could one advance in the psychotic 
field? Clearly, changes along the dimensions of neuroticism and 
extraversion-introversion would only be expected to occur in a
very attenuated form, if at all. Changes in intelligence level cannot 
be predicted in terms of our dimensional system, as cognitive pro- 
cesses have not yet been linked up properly with psychoticism. The 
main prediction, therefore, would be a shift along the psychoticism 
axis, in the direction of greater normality. It would be easy to 
translate this prediction into operational terms by reference to the 
tests which have been shown to define a factor of psychoticism;
unfortunately no research is available in which these tests have
been used. There is some evidence that our prediction with respect
to neuroticism and extraversion-introversion is fulfilled, however,
in Crown's study of the effects of prefrontal leucotomy on 36
psychotic patients, retested three months after the original operation
(1951).

Relevant results include a slight tendency towards lower in- 
telligence test scores. ‘Post-operative changes on tests of speed- 
accuracy, persistence, level of aspiration, and on the Rorschach 
“experience type” gave no consistent support to the hypothesis 
that a psychotic group becomes more extraverted after leucotomy.’
‘There were no significant changes after leucotomy in the scores of
tests designed to measure Eysenck’s . . . factor of personality organ- 
ization. The general direction of the changes which occurred 
on tests of suggestibility, static ataxia and on the oscillation test 
suggested an increase in the “normality” of the group.’ 

When one is dealing with definite hypotheses, negative results 
may be as important as positive results. In terms of our conception 
of the psychotic axis as running orthogonal to the neuroticism and 
extraversion-introversion axes, it is quite easy to understand why 
results found on a group of neurotic patients cannot be duplicated 
on a group of psychotic patients; indeed, any other result would 
have thrown doubt on the general theory. Similarly, we may add 
to our positive forecast regarding the shift of psychotic patients 
on the psychoticism axis a negative one, viz. that no such shift 
would take place in neurotic patients. Both predictions are easy to 
test, and refutation or verification should only be a matter of time. 


(2) The Employability of Mental Defectives 


The work here described was undertaken to study the employ- 
ability of mental defectives; this severely practical aim has theo- 
retical implications of great importance, however, which were 
clearly brought out by the two investigators, J. Tizard and N. 
O'Connor, in a series of papers describing their investigation 
(1950a, 1950b, 1950c, 1951; Tizard, 1951; O'Connor, 1951). The
present account is concerned only with that section of their data
which is relevant to our general hypothesis of a ‘neuroticism’
factor; for more detailed descriptions the reader is referred to the 
papers quoted above. 

It is widely believed that mental defectives, certified as such 
and segregated in special institutions, are characterized by I.Q.s
well below the somewhat arbitrary but widely accepted limit of 70
originally suggested by Terman. Some writers have suggested
lower limits than this, and most authorities would consider that
factors other than the I.Q. should be considered in certification;
however, it is widely agreed that although individuals having I.Q.s
of over 70 may occasionally be certified, only very exceptional 
circumstances would justify such a step. This general picture is not 
in agreement with results obtained on 104 high-grade institutional- 
ized male mental defectives, who were given 5 tests of general 
intelligence. The tests used, together with the mean I.Q.s obtained 
by the group, are given in Table XLIX; it will be seen quite 
clearly that the average level of ability of this group is represented 
by an I.Q. level of about 75. Thus the average score of this group
is well above what one would normally have expected to be the
maximum score. Indeed, scores as high as I.Q. 120 and above
were recorded by isolated individuals on the Kohs Block Design
test, the Binet Vocabulary test, and the Porteus Mazes test!


TABLE XLIX

Test                                           Mean I.Q.

Kohs Block Design, Alexander Version           [illegible]
Progressive Matrices, 1938, untimed            [illegible]
Binet Vocabulary Test                          [illegible]
Porteus Mazes Test, Vineland Revision          83
Cattell Non-Verbal Test, Form I.B.*            73

* S.D. adjusted to 16 from 25 as given by Cattell, so as to
make the results comparable with other tests.
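The adjustment mentioned in the footnote to Table XLIX may be illustrated as follows. This is a sketch under stated assumptions: it supposes that the scale is centred on a mean of 100 and that the adjustment is a simple linear rescaling of deviations from that mean; the function name `rescale_iq` and the example score are hypothetical and do not come from the original data.

```python
def rescale_iq(iq, old_sd, new_sd, mean=100.0):
    """Re-express a deviation score on a scale with a different S.D.

    Assumes a common mean (100 by default) and a linear rescaling of
    the deviation from that mean.
    """
    if old_sd <= 0 or new_sd <= 0:
        raise ValueError("standard deviations must be positive")
    return mean + (iq - mean) * new_sd / old_sd


# Hypothetical example: a score of 50 on a scale with S.D. 25,
# re-expressed on a scale with S.D. 16.
print(rescale_iq(50, old_sd=25, new_sd=16))
```

On these assumptions a score two S.D.s below the mean on the original scale remains two S.D.s below the mean on the adjusted scale, which is what makes it comparable with the other tests.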


Several different interpretations are possible of these data. 
Certification may be at fault, in not giving sufficient weight to test 
results, and particularly by relying on one single test (Binet vocabu- 
lary) to the exclusion of other tests less subject to environmental 
influences. Indeed, it might plausibly be argued that the assess- 
ment of an individual's intelligence should never be undertaken on 
the basis of a single test, but that a battery of tests administered 
and interpreted by an expert in diagnostic testing constitutes the
only safe basis for such an assessment. Even if we assume that the
original testing was carried out competently, and that the average
I.Q. of our group at that time was around the 50 level, the statistical
fact of regression (assuming a correlation of about [illegible] between
original test and retest, a figure which is more likely to be an
overestimate than an underestimate, in view of the low observed
intercorrelations of the five tests in Table XLIX) would have
caused the average to rise considerably in the years intervening
between the original and the final testing. Disregard of the facts of
regression in assessing intellectual status and development is undoubtedly
a potent cause of aberrant I.Q. values in samples of
high-grade mental defectives. 
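The regression argument can be made concrete with a short sketch. It assumes a population mean of 100, equal S.D.s on the two occasions, and a linear regression of retest on original score; the test-retest correlation used in the example (.8) is purely hypothetical and is not the figure assumed in the text.

```python
def expected_retest(initial_score, retest_correlation, population_mean=100.0):
    """Expected retest score under linear regression towards the mean.

    Assumes equal standard deviations on both occasions, so that the
    regression coefficient equals the test-retest correlation.
    """
    if not -1.0 <= retest_correlation <= 1.0:
        raise ValueError("correlation must lie between -1 and +1")
    return population_mean + retest_correlation * (initial_score - population_mean)


# Hypothetical illustration: a group averaging I.Q. 50 at certification,
# with an assumed test-retest correlation of .8.
print(expected_retest(50, 0.8))
```

The lower the correlation between original test and retest, the further the expected retest average moves back towards the population mean, without any real change in the ability of the individuals concerned.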

However, in addition to these two possible explanations of the 
rather puzzling facts reported by Tizard, we must consider the 
possibility that certification is carried out on the basis, not merely 
of intellectual defect, but of a combination of mental defect and 
neuroticism. The child that is merely dull, with an I.Q. of 60 or
65, may easily escape certification; the child that is less dull, but is 
also suffering from emotional instability, is far more likely to be 
found an unbearable nuisance by Society, and to be certified as a 
mental defective. On this hypothesis, ‘mental deficiency’ of high- 
grade defectives would be regarded as a combination of intellectual 
deficit and emotional instability, weighted differently by different 
certifying officers. High emotional stability could then counterbalance
a low I.Q., while low stability would offset even a
moderately high I.Q.

On the basis of these considerations one would be able to make 
two predictions regarding the employability of high-grade defec- 
tives. The first prediction would be that they would be rather 
variable with respect to neuroticism, the second that neuroticism 
would play a more important part than intelligence in their success 
or failure to adjust successfully to their employment. One further 
prediction may be made, namely that neuroticism in defectives can 
be measured objectively with the same tests used with normals. 
These three predictions, or hypotheses, form an interlocking system 
which can easily be tested empirically. 

The first hypothesis to be tested is clearly the one related to the 
applicability to defectives of the tests of neuroticism found to be 
valid in normals. The subjects used in this study were 104 high- 
grade mental defectives; they were consecutive admissions to 
Darenth Park, with cases with physical handicaps, or with I.Q.s 
below 50 excluded. Mean age was 21 years, with an S.D. of 4.6.
The following tests of neuroticism were used: (1) Body Sway test of 
Suggestibility; (2) Heath Rail Walking test; (3) Manual Dexterity 
(U.S.E.S. tests M and N); (4) Finger Dexterity (U.S.E.S. tests O 
and P); (5) Speed on the Track Tracer test. Also included was a 
measure of intelligence, namely the Matrices test. In addition to 
these tests, a four-point rating of stability was used, based on the
social behaviour of the defectives tested. Each subject was rated by
three judges, a psychiatrist and two psychologists, who were equally
familiar with the subjects. Intercorrelations of judges’ ratings
averaged .6; a combined rating was obtained in which complete
agreement of the three judges was required. The four points of the 
rating scale were defined in detail (O’Connor, 1951); briefly they 
were: (1) Markedly stable, (2) Stable but immature, (3) Rather 
unstable, (4) Markedly unstable. Numbers of defectives in these 
four categories were respectively 45, 15, 20, and 15. 
[...] neuroticism to the extent of .46. This is slightly higher than the value
given for the correlation between intelligence and neuroticism in 
normal groups, but the difference is not significant, and may be 
due to sampling errors. The second factor is not to be regarded as 
a pure factor, but appears to be loaded largely with tests requiring 
muscular co-ordination—rail walking, manual and finger dexterity, 
speed test. The possibility that such a factor may exist cannot of 
course be gainsaid; it is merely doubted whether it can justifiably 
be postulated on the basis of these results. It is of interest, however, 
that suggestibility has no loading on this factor; thus it does not 
appear that lack of muscular co-ordination is responsible for body 
sway in the suggestibility test. 

We may conclude then that our hypothesis regarding the use- 
fulness of tests validated on intellectually normal groups for the 
measurement of neuroticism in defectives may be regarded as 
strongly supported. An interesting exception to this statement is 
the persistence test, which was also applied to the 104 defectives. 


This test was given in two forms, the leg persistence test, in which
the subject is required to hold out his leg unsupported as long as he
[...] in which he is required to hold a [...]
pull for as long as he can. [...] results with intellectually
[...] neurotic from the normal [...], discrimination was very
[...] with the others in Table L [...]
analysed the saturations of the [...]
right direction, were below the [...]
of persistence tests not to give [...]
defectives had already been noted [...]
1948), and an explanation of this [...]
great interest. It illustrates [...]
the often neglected statement that while we may [...]
persistence, and intelligence [...] linear regression
and normal distribution are liable to fail [...]
the extremes of the distribution.

[...] other hypotheses, viz. those dealing [...]
the 104 defectives, their work success [...]
estimated by a Social [...]
of the scales shown in Table LI.
TABLE LI

Rating Scale                                              Rating     N

Conspicuous all-round success on daily licence               7        6
Settled down well in first job in daily licence              6       13
Satisfactory on daily licence after not more than two
    replacements                                             5       17
Settled on daily licence after several trials                4       30
Failed consistently on daily licence                         3       16
Successful in institution work, but unlikely to be
    considered for licence                                   2        4
Unsuccessful in Institution workshop                         1        8


This employability criterion was incorporated in Table XLIX,
and it will be seen from Figure 33 that it had a factor saturation
for neuroticism of .66, as well as a slight correlation with the motor
co-ordination factor (.31). It appears, then, that our original hypothesis
is borne out by the results; employability of mental defectives
is forecast with considerable accuracy by neuroticism tests.
Employability is forecast by both factors in combination to the
extent of a correlation of .73. Considering the unreliability of the
criterion (no data are available on this, but a general knowledge
of the many chance factors entering into a person’s success in
employment under the general conditions of this experiment suggests
an upper bound to reliability in the neighbourhood of .8) this
must be regarded as a remarkable success. The conclusion seems
justified that neuroticism is a very important factor in the employability
of defectives. It will be seen from Table L that the influence
of intelligence on employability is very much less important; the
correlation between employability and score on the Matrices test
is only .34, so that only about one-fourth as much predictive power
is given by intelligence tests as by neuroticism tests.
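The arithmetic behind these statements can be sketched as follows, reading the two factor saturations of the criterion as .66 and .31 and the Matrices correlation as .34. Two assumptions are made which the text does not state explicitly: that the two factors are orthogonal, so that the combined correlation is the square root of the sum of the squared saturations, and that 'predictive power' is to be read as the squared correlation. The function name is hypothetical.

```python
from math import sqrt


def multiple_r_orthogonal(saturations):
    """Multiple correlation of a criterion with uncorrelated factors.

    Valid only if the factors are orthogonal: the squared multiple
    correlation is then the sum of the squared saturations.
    """
    return sqrt(sum(s * s for s in saturations))


neuroticism_saturation = 0.66   # employability on the neuroticism factor
motor_saturation = 0.31         # employability on the motor co-ordination factor
matrices_correlation = 0.34     # employability with the Matrices test

combined = multiple_r_orthogonal([neuroticism_saturation, motor_saturation])
variance_ratio = matrices_correlation ** 2 / neuroticism_saturation ** 2

print(round(combined, 2))        # combined forecast from both factors
print(round(variance_ratio, 2))  # intelligence relative to neuroticism
```

On these assumptions the ratio of squared correlations comes to roughly one-fourth, in agreement with the comparison drawn in the text.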

The social consequences of these findings are considerable. 
It will be remembered that this group of 104 defectives was not
selected but constituted successive admissions of physically not
handicapped persons with I.Q.s over 50. Two-thirds of this group
was suitable for employment in the community, at wages averaging
£4 10s. 0d., and going up to £7 on occasions. A further 29 per
cent were suitable for institution employment. Only 8 per cent 
were unemployable custodial cases. Both society and the defectives 
themselves would benefit from the introduction of routine test- 
ing along the lines suggested of all high-grade defectives, and a 
[...] policy of licensing and work-placement. It hardly needs
[...] that all such work should be supervised and
controlled by fully qualified clinical psychologists, well acquainted 
with the problems, both psychological and statistical, which arise 
in mental testing, and specially trained in the administration of 
objective performance tests of personality. Untold harm could be 
done by the use of psychological tests in unskilled hands; indeed 
it is only too obvious that the indiscriminate use of such tests as the
Binet by those whose background training has not included a 
considerable amount of mental testing and statistics is doing much 
harm to the whole mental testing movement. How much greater 
could be the harm resulting from the indiscriminate use of the much
more complex and difficult tests used in personality measurement! 


(3) Work Adjustment of Unskilled Factory Workers 


Among the fields studied by industrial psychologists, those of 
productivity and work adjustment are recognized to be of para- 
mount importance. Most of the published work has dealt with the 
relation of abilities to those concepts. A pioneer study was carried 
out by Markowe, Heron, and Barker, respectively a psychiatrist, a
psychologist, and an economist, into the relationship obtaining 
between productivity and work adjustment on the one hand, and 
various temperament and character traits on the other. Their work 
is of particular interest as it extends the type of research described 
in this book to unskilled factory workers, a group not previously 
investigated. It is natural that in the brief summary of their work 
here given, most stress will be laid on the psychometric aspects, so
that for the majority of the facts and calculations mentioned we 
are indebted to Alastair Heron, whose own account of the experi- 
ment should be consulted for further details (1951). 

The research was carried out in the main factory of a co-ordin- 
ated group of manufacturing and trading firms in the chemical 
and light engineering industry manufacturing a range of lead-acid 
accumulators. The medical member of the team was the first to 
make direct contact with the firm, and spent several months ‘work- 
ing his way’ round the various factory departments. This pre- 
liminary period of getting acquainted constituted an indispensable 
part of the general approach. Having ensured general co-operation 
from management, and from worker representatives through the 
Works Committee, meetings were held with selected officials, 
including the Chief Shop Steward, to discuss the research project.
Meetings were also held with the men who constituted the experimental
group, and an outline given of the aims and objects of the
investigation. 

The experimental group selected consisted of grid-casters in 
the Moulding Department. No selection took place; all eighty-five 
men who constituted the entire population of grid-casters were 
approached. Of these, one refused to co-operate, one was trans- 
ferred to another factory, and three left of their own accord before 
the conclusion of the project. Thus the experimental group con- 
sisted of eighty persons. These were subjected to a variety of 
psychiatric, medical, and psychological procedures. 

(1) Height, weight, vision and hemoglobin measurements were 
made routinely, as were also lumbar pull dynamometer tests and 
the Schneider test of cardio-vascular efficiency. Clinical examina- 
tion of special systems was performed when indicated or requested. 

(2) A medical and psychiatric interview was carried out by the 
psychiatrist, and ratings made on the basis of this interview and 
the measures described under (1) combined for: (a) Past health,
(b) Recent mental health, and (c) Recent physical health, using
4-point scales; also rated were thirteen personality traits, on 3-point 
scales. These health and personality ratings were made by the 
psychiatrist quite independently, without the aid of supervisors’ 
ratings, productivity data, or absenteeism records. 

(3) Most of the psychological tests given have already been 
discussed in connection with other researches. They included the 
following tests of intellectual ability: The Dominoes test, which is 
probably the purest ‘g’ test yet devised, the U.S.E.S. Paper Form 
Board, the Mill Hill Vocabulary test, and a letter series test. Tests 
of temperament used were the following: Crown Word Connection 
List, Hand Dynamometer persistence, Cattell’s ‘237’ test of per- 
severation, a ‘Worries’ inventory, the U.S.E.S. Peg and Finger 
Boards, the Track Tracer, Static Ataxia, Leg Persistence, an 
‘Annoyances’ inventory, an ‘Interests’ inventory, the Rees-Eysenck 
Index of Body Build, a ‘Food Dislikes’ inventory, and a Level of 
Aspiration test. An individual Rorschach test was also adminis- 
tered, but results have not been analysed in conjunction with the 
other tests and will not be discussed here. Total time of administra- 
tion was in excess of two hours per subject. 

Having described the predicting variables, we must now turn 
to the predicted variables, i.e. the criteria for this particular 
research. It is well known that in industry good criteria are hard 
to find, and much research was devoted to finding reliable and
valid criteria of Work Adjustment and of Productivity. 

(1) Work Adjustment. A panel of six raters was established, 
consisting of the foreman, a former chargehand now in the depart- 
mental office and in daily contact with all the men, and the four 
chargehands, who rotate independently of the three piecework 
shifts worked by the men. Two sets of ratings were obtained from 
the raters, an interval of five weeks elapsing between the two 
ratings. The method used is rather original, and is fully described 
by Heron (1951). 


‘Will you please listen carefully to the explanation which I read 
to each of the six supervisors who are taking part in this section of 
the research project.

‘The object of this exercise is to classify the eighty men whom
we have seen in the Moulding Department with respect to their 
apparent adjustment to the total demands of the job-situation. 

‘It is important to make it quite clear from the outset, first of 
all, that these classifications will remain strictly confidential to the 
members of the Team; and secondly, that we do NOT want a 
rating which is based solely or mainly on a man’s rate of produc- 
tion as compared with others. 

‘In front of you are three trays; in my hand is a set of cards,
each card bearing the name and clock number of one of the eighty 
men who are taking part in the research. 

‘In a few moments I shall place a large card at the head of each
tray, and after giving you an opportunity to read what they say,
and to observe that the two side ones describe extremely different
people, I shall start handing you one small card at a time. If you
feel that the man whose name and clock number is on the card is
[...] that the description is too extreme [...]
then drop his card in the [...] marked ‘Average’. On
this first sorting, there [...]
than in the two side [...] put together. Is that clear?’


(PAUSE. ANSWER ANY QUESTIONS.)


(SET LARGE CARDS AT HEAD OF TRAYS. PAUSE TO LET
HIM STUDY THE DESCRIPTIONS. ANSWER ANY QUESTIONS.)

‘I shall hand you a fresh card every 5 seconds, like this.’
(DEMONSTRATE WITH THREE BLANKS.) 
‘Right, let's begin, and keep up a steady pace.’ 


(When the first sorting is complete, take the cards from the
‘VERY WELL-ADJUSTED’ tray and if they exceed TEN in number, ask
the rater to look through them, and to pick out ‘the best 8 or 10’.
Place rejects in centre tray; repeat for ‘BADLY-ADJUSTED’ if necessary,
asking him to pick out ‘the worst 8 or 10’. Now place second
pair of large cards in side trays, take pack from centre tray in hands 
and say:) 


‘Now we are going to sort out the cards of the average men. The 
procedure will be the same as before, except that this time as the 
end trays are not such extreme descriptions, we can expect to have
about a quarter of the cards in each of them, and about half in the
middle tray. Right?’


(REPEAT SORTING PROCEDURE.) 
(When the second sorting is complete, adjust the contents of
the side trays to about 14-20 cards each if possible, but DO NOT
FORCE AN INCREASE OR DECREASE against the rater’s selection if he
seems satisfied.)


MATERIAL ON LARGE CARDS

Rating 1 (first two cards in end trays)
    Thoroughly settled down
    Never have to worry about him
    No trouble to me at all
    Well-adjusted to the job

Rating 5 (first two cards in end trays)
    Has never really settled down
    Always having to keep him in mind
    One of my headaches
    Badly-adjusted to the job

Rating 2 (second two cards in end trays)
    Has settled down better than average
    Seldom need to worry about him
    Not much trouble to me
    Fairly-well adjusted to the job

Rating 4 (second two cards in end trays)
    Has settled down less well than average
    Quite often have to keep him in mind
    Is quite some trouble to me
    Not very well adjusted to the job

Rating 3 (used in middle tray on both sortings)
    AVERAGE


Product-moment correlations were calculated between each
pair of raters on each occasion, as well as between the ratings of
each rater from one occasion to the other. The average intercorrelation
between the six raters was .483, indicating a validity
of the average rating of approximately .90 (Eysenck, 1939). The
average reliability of the raters was .781. A factor analysis of the
table of intercorrelations was carried out, yielding a single general
factor which accounted for 50.5 per cent of the variance, and
leaving no residuals greater than the S.E. of a zero correlation.
The total variance may therefore be apportioned as follows: Communality
= 50.5 per cent, Specificity = 27.6 per cent, Unreliability
= 21.9 per cent. In view of the prominence of one general
factor in these ratings, the twelve ratings for each individual rated
were combined into a single average rating, which constitutes the
Work Adjustment Rating defined as the measure of concern caused
to the supervisors by the worker.
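The apportionment of the total variance follows directly from the two figures quoted, as the sketch below shows. It assumes only the usual decomposition in which the reliable variance is the sum of communality and specificity, and the remainder is unreliability; the function name `partition_variance` is hypothetical.

```python
def partition_variance(communality, reliability):
    """Split total (unit) variance into common, specific and error parts.

    communality : proportion of variance due to the general factor
    reliability : proportion of variance that is reliable (repeatable)
    """
    if not 0.0 <= communality <= reliability <= 1.0:
        raise ValueError("expected 0 <= communality <= reliability <= 1")
    return {
        "communality": communality,
        "specificity": reliability - communality,
        "unreliability": 1.0 - reliability,
    }


# Figures quoted in the text: general factor 50.5 per cent of the
# variance, average reliability of the raters .781.
parts = partition_variance(communality=0.505, reliability=0.781)
for name, value in parts.items():
    print(f"{name}: {100 * value:.1f} per cent")
```

The three parts necessarily sum to the whole of the variance, which provides a simple check on the figures.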

(2) Productivity. A performance index was used, in which the
number of grids produced, multiplied by the agreed rate of production
expressed as a number of ‘standard minutes’ per hundred,
is divided by the actual clock minutes worked, to assess the
value of a worker’s time in terms of his actual production. Various
calculations showed that this index, which had been in use for a
long time in connection with a wage incentive system, could be 
regarded as a reasonable one, within the limits set to high indi- 
vidual production rates by the general feeling of solidarity in the 
workshop. As is well known, fear of ‘overproduction’ resulting in 
unemployment, opposition to high taxation, which affects high 
earnings due to high production, and tensions created by out- 
producing less able members of the group all tend to impose a level 
of uniformity on production which keeps it below the maximum 
possible without undue exertion. All variations in productivity 
are therefore considerably attenuated, and much smaller than
they would be if each individual were working ‘full out’. The 
difficulties introduced into the analysis by these extraneous factors 
are likely to reduce any correlations which might be found between 
productivity and other factors almost to vanishing point. 
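The performance index described above may be sketched as follows. The conversion of the 'per hundred' rate to a per-grid rate (division by 100) is an assumption made here so that a value of 1.0 corresponds to exactly the agreed standard; the function name and the example figures are hypothetical and are not taken from the factory records.

```python
def performance_index(grids_produced, standard_minutes_per_hundred,
                      clock_minutes_worked):
    """Standard minutes earned per clock minute actually worked.

    Assumes the agreed rate is stated in standard minutes per hundred
    grids, so it is divided by 100 to give the allowance per grid.
    """
    if clock_minutes_worked <= 0:
        raise ValueError("clock minutes worked must be positive")
    standard_minutes_earned = grids_produced * standard_minutes_per_hundred / 100.0
    return standard_minutes_earned / clock_minutes_worked


# Hypothetical shift: 1,200 grids at 40 standard minutes per hundred,
# produced in 480 clock minutes.
print(performance_index(1200, 40, 480))
```

An index above 1.0 would then indicate production faster than the agreed standard, and an index below 1.0 production slower than it.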

Product-moment intercorrelations were calculated between all 
the variables. In view of the fact that age correlated significantly 
with several of the variables, partial correlations were calculated 
and the resulting table of correlations (with age partialled out)
submitted to a factorial analysis. The results of the centroid analysis
are given in Table LII, together with the rotated solution. To 
facilitate interpretation, a diagram has been prepared of the 
position of the various tests on the plane formed by factors I and II 
(Figure 35). It will be seen that factor I is characterized by the 
four intelligence tests (Dominoes, Letter Series, Vocabulary, and 
Paper Form Board), and by two speeds of writing tests, which in a 
group of this composition are known to correlate quite highly with 
intelligence. No other test has a high positive correlation with this 
factor, which is clearly identifiable with intelligence.²
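The partialling out of age referred to above can be illustrated with the standard first-order partial correlation formula. This is a sketch only: the function name and the three correlations in the example are hypothetical, and nothing is implied about the particular computing routine used by Heron.

```python
from math import sqrt


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y with z held constant.

    Standard first-order formula:
        r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))
    """
    denominator = sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    if denominator == 0.0:
        raise ValueError("partial correlation undefined when |r_xz| or |r_yz| is 1")
    return (r_xy - r_xz * r_yz) / denominator


# Hypothetical values: two tests correlating .50 with each other and
# .30 and .40 respectively with age.
print(round(partial_correlation(0.50, 0.30, 0.40), 3))
```

Applying this to every pair of variables, with age as the third variable, yields a table of correlations of the kind then submitted to factorial analysis.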

The second factor is characterized by all those tests which had 
been found previously to be good measures of neuroticism: many 
worries, many annoyances, high static ataxia, poor persistence (on 
both tests), many food aversions, neurotic score on Word Con- 
nection List, poor mental health rating. Poor job adjustment also 
has a projection on this factor, which encourages us to interpret it 
with a fair measure of confidence as neuroticism. It should be 


1 The rotated solution shown here differs slightly from that given by Heron 
himself. This difference, however, is not material to the argument. 

? 'The Word Connection List, a measure of neuroticism, has a high negative 
saturation in intelligence in this analysis. Previous work has not disclosed any 
reason for this surprisingly high correlation. 
pointed out, however, that there are a few anomalous findings 
which contradict previous work, and which can at the moment 
only be ascribed to the fact that this group is very different in 
composition and background to any of those hitherto studied. Thus 


[Figure 35. Position of the various tests on the plane formed by 
factor I (Intelligence) and factor II (Neuroticism). Note: some items 
near origin omitted.] 


the item ‘Many interests’ has a positive correlation with neuroti- 
cism; eurymorphic body-build, and good finger dexterity, also 
show positive correlations with neuroticism, although at such a 
low level that sampling errors might be responsible for these find- 
ings. However, it would not be wise to dismiss findings contrary to 
hypothesis in any cavalier manner, and it may even be one of the 
[best] ways of developing our knowledge in this field to [accept such] 
results and attempt to find an explanation [for them]. 

The third factor is much more difficult to identify, [but it may be 
noted that] one pole is characterized by low persistence, [words 
illegible] high score on the Word Connection List, and [words 
illegible] food aversions, and that possibly we may here be dealing 
with the [introversion-extraversion] factor, in somewhat attenuated 
form. [The tests] mentioned characterize the prototype of the 
[extravert], the hysteric, as opposed to the prototype of the 
[introvert], the dysthymic, and although not much reliance can be 
placed on this identification it is well in line with previous work, 
although here again by no means all the individual tests can be said to 
confirm the suggestion made. 

Factor four is rather easier to identify; it is characterized 
essentially by manual and finger dexterity, speed in carrying out a 
co-ordination test, strong grip, speed in writing, and other [tests] 
indicative of speed and co-ordination. In a job depending so [much] 
on motor co-ordination, it is perhaps small wonder that ‘poor job 
adjustment’ is negatively correlated with this factor, and ‘high 
productivity’ positively. 

Certain interesting features of Table LII may be worthy of 
comment. It will be seen that the item ‘Psychiatric rating of 
mental health’ is almost completely uncorrelated with any other 
item, and has only negligible factor saturations. Its saturation on 
the neuroticism factor is probably too small to be significant, but 
in so far as attention may be paid to it its direction will be seen to 
be contrary to expectation: individuals with a ‘good’ rating of 
mental health will be found to have high neuroticism scores on the 
whole, and vice versa. This somewhat surprising result is [unlikely] 
to be due to deficiencies in the examining psychiatrist; it may with 
some confidence be said to be the result of the very unusual 
situation in which the ratings were made. In the normal clinical 
situation, the psychiatrist is sought out by the patient, who readily 
confides in him. In addition, the psychiatrist has available [records] 
concerning the work history of the patient, and other information 
which is helpful to him. In the experiment under discussion, the 
psychiatrist requests the subject to see him, thus arousing suspicion 
and perhaps hostility in a working-class population not entirely 
oblivious of periods of unemployment and lay-offs, when [any] 
personal deficiency was something to be hidden from the employer 
or what might be considered his representatives. Also, records of 
past work-history and the like would of course have contaminated 
the psychiatrist’s rating, which was to be correlated with items of 
this very kind, and consequently no information on these points 
was made available to him. It is not surprising that under these 
conditions, so different from those under which he is accustomed 
to exercise his skill, the psychiatrist’s rating contributed little to 
the prediction of either criterion, or to the measurement of any 
factor. 

The supervisor’s rating of ‘poor job adjustment’, on the other 
hand, shows high saturations on the neuroticism factor (in the 
expected direction), and also on the motor speed factor. Produc- 
tivity shows high saturations only with the motor speed factor. 
It is interesting that intelligence has no saturations of any size 
on either of the two criteria; it is perhaps less surprising that 
introversion-extraversion should be neutral with respect to work 
adjustment and efficiency. 

The data quoted above show that (1) personality tests of the 
type advocated in this book can be given effectively in an industrial 
situation; (2) they show patterns of personality organization in 
unskilled workers very similar to the patterns shown by other 
groups hitherto studied; (3) they are applicable in conditions which 
severely restrict the proper functioning of psychiatrists; (4) they 
have a certain validity in predicting work adjustment and efficiency, 
even under conditions which make the assessment of these two 
criteria rather difficult. 

In support of points (1) and (2), another study may be men- 
tioned which was carried out on yet another group of subjects 
differing in many important respects from those studied previously. 
As part of an enquiry into miners’ rheumatism, Miss Braithwait 
administered the following five tests of neuroticism to a group of 
colliery workers, consisting of 84 miners, 45 maintenance workers 
and 45 office-workers: (1) Word Connection List; (2) Maudsley 
Medical Questionnaire; (3) Leg persistence test; (4) Annoyances 
test; (5) Finger dexterity test. These tests were intercorrelated by 
Alastair Heron, and a factorial analysis performed which resulted 
in one general factor. The data are given in Table LIII; it will be 
seen that all the saturations are positive and in the expected 
direction; the saturation of the persistence test is as high as .519, 
that of the Maudsley Medical Questionnaire .419, while the Annoyances 
test has the lowest discriminatory power (.225). Multiple R, 
indicating the accuracy with which these five tests, [words illegible], 
would measure the general factor they define, is .686; it must be 
realized, of course, that this value is subject to shrinkage due to the 
well-known tendency of R to capitalize on chance errors of 
measurement. Nevertheless, the results strongly support the view that the 
tests used measure one underlying factor of neuroticism, even in 
a group of subjects very unusual in psychological enquiries. 
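
The relation between saturations, observed correlations, and residuals in a one-factor solution can be checked with a short calculation. The sketch below is illustrative only. It assumes the surviving entries of Table LIII are read as decimals (saturations of .333, .419, .519 and .225 for tests 1 to 4, and correlations of test 1 with tests 2 to 4 of .128, .086 and .040). That reading of the damaged table is itself an assumption, and the variable and function names are hypothetical.

```python
# Saturations on the general factor, as read from the text and Table LIII
# (the decimal reading of the damaged table entries is an assumption).
saturations = {1: 0.333, 2: 0.419, 3: 0.519, 4: 0.225}

# Observed correlations of test 1 with tests 2, 3 and 4 (same caveat).
observed = {(1, 2): 0.128, (1, 3): 0.086, (1, 4): 0.040}


def residual(i: int, j: int) -> float:
    """Observed correlation minus the product of the two saturations."""
    reproduced = saturations[i] * saturations[j]
    return observed[(i, j)] - reproduced


for pair in observed:
    print(pair, round(residual(*pair), 3))
```

On this reading the three residuals agree with the first three residuals legible in the first row of the table, which lends some support to treating a single general factor as sufficient for these correlations.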


TABLE LIII 
N = 174        S.E. zero r = .076 

Test                     2        3        4        5     Saturation 
 1    r               +.128    +.086    +.040    +.148       .333 
      Residuals       -.012    -.087    -.035    +.064 
 2    r                        +.221    +.199    +.013       .419 
      Residuals                [entries illegible] 
 3    r                        [entries illegible] 
      Residuals 
 4    r                        [entries illegible] 
      Residuals 

[One further value, +.176, survives but cannot be placed.] 

Key: 1. Word Connection List 
     2. Maudsley Medical Questionnaire 
     3. Leg Persistence Poor 
     4. Many Annoyances 
     5. Poor Finger Dexterity 

R (tests with factor) = .686 

(4) Student and Nurses’ Selection 

In a review of the literature on student selection by means of 
psychological tests, Eysenck (1947c) [words missing] start which had 
been [words missing] intelligence. Some experimental efforts have been 
made to investigate this hypothesis, and although results obtained so 
far are less 
direct and less convincing than those related to other hypotheses 
discussed in this chapter, they are presented here for the sake of 
completeness. 

The first study to be mentioned was carried out by Petrie 
(1948) on 49 male and 8 female students at a Medical School. The 
necessity of using group tests eliminated ab initio most of those tests 
which in the past had been found to be good measures of neuroti- 
cism, and left three measures which with some misgivings were 
applied to the students. These three measures were: (1) Word 
Connection List, (2) Maudsley Medical Questionnaire, and (3) 
the Index of Inaccuracy. This index, originally used as a measure 
of neuroticism by Himmelweit (1950), was calculated by dividing 
the number of false answers in all the cognitive tests by the total 
number of answers attempted, and was hypothesized to be cor- 
related with neuroticism in terms of the theory that emotional 
maladjustment interferes with smooth mental functioning. 
Correlations between the Index of Inaccuracy and the two other tests 
were significantly positive for the students investigated (r = .43 
and .40 respectively), and we may therefore regard the hypothesis 
that the Index of Inaccuracy is a measure of neuroticism as sup- 
ported. In addition to these three neuroticism tests, several cogni- 
tive tests were given, including tests of Vocabulary, classification, 
rote memory, sentence completion, fluency, decoding, and form 
perception. (The fluency test might of course also be regarded as a 
personality test, in view of the fact that low fluency has been shown 
to correlate with neuroticism (Eysenck, 1947) and other traits.) 
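
The Index of Inaccuracy described above is a simple ratio, and a minimal sketch of its calculation may be helpful. The function name and the tallies are hypothetical and are not drawn from Petrie's data; only the definition (false answers divided by answers attempted, pooled over all cognitive tests) comes from the text.

```python
def index_of_inaccuracy(false_answers: int, answers_attempted: int) -> float:
    """Proportion of attempted answers that were false."""
    if answers_attempted <= 0:
        raise ValueError("at least one answer must have been attempted")
    if not 0 <= false_answers <= answers_attempted:
        raise ValueError("false answers must lie between 0 and answers attempted")
    return false_answers / answers_attempted


# Hypothetical tallies for one student: (false answers, answers attempted)
# on each of three cognitive tests, invented for illustration.
tallies = [(4, 40), (6, 30), (2, 50)]

total_false = sum(wrong for wrong, _ in tallies)
total_attempted = sum(attempted for _, attempted in tallies)
index = index_of_inaccuracy(total_false, total_attempted)
print(index)
```

Pooling the counts before dividing, rather than averaging the separate test ratios, follows the wording of the definition; whether the original computation did the same is not stated.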

To obtain a criterion superior to the usual one of examination 
results, two independent judges were chosen by the Dean, who 
rated the students on a five-point scale with respect to their medical 
ability. The combined rating was found to have a reliability of .80. 
Correlations of the neuroticism measures with the rating were all 
in the expected direction, but disappointingly low. For the two 
verbal tests (Maudsley Medical Questionnaire and Word Connection 
List) correlations were -.16 and -.08 respectively; for 
the Index of Inaccuracy the correlation was -.27, and for the 
Fluency test it was .29. Except for the latter two tests, the 
correlations are too low to be of any practical use. A multiple R of .63 was 
found from the battery as a whole, excluding sentence completion 
and Word Connection List; this R is of course a slight overestimate 
in view of the well-known fact that multiple R tends to capitalize 
on chance errors. However, later work done with these [tests has 
shown] that the correlations observed in this sample are typical [of 
other groups] as well, and that a predictive accuracy of between 
[figures illegible] [can] easily be reached. Given the present selection 
[word illegible] in medical schools, it can be shown that by the [use of 
these] tests the failure rate could easily be reduced from its present 
[level] of 18 per cent to between 1 and 2 per cent. Similarly, the 
number of students rated ‘very good’ could be increased from the present 
level of 10 per cent to between 30 and 40 per cent. While [non-]cognitive 
tests did not contribute to any considerable extent to the 
prediction, the discovery that the Index of Inaccuracy is a measure 
of neuroticism must be counted as a valuable addition to our 
knowledge of the operational meaning of this term. 

Another attempt to use non-cognitive tests was made by 
Himmelweit (1950, 1951), though on a rather larger scale. This 
study is of particular interest because it contains a comparison in 
predictive accuracy between several different types of [predictors], 
such as interviews, cognitive tests, temperament tests, biographical 
information, and achievement tests. In scope and thoroughness, 
this study is without doubt the outstanding British contribution to 
the experimental investigation of student selection. 

Students tested were first of all submitted to the usual admission 
procedure, which consisted of an interview, a general essay 
paper, and a paper on paraphrase and précis. Interviews were 
conducted by a number of Interviewing Boards, consisting of members 
of the Academic Staff. According to the memorandum circulated 
to the Boards the objectives of the interview were as follows: 

(1) ‘The interview is to be considered as an independent factor 
to be subsequently correlated with the candidate’s paper 
qualifications and Entrance Examination results. Nevertheless, it is 
desirable to obtain a grasp of the candidate’s paper qualifications and 
background before the interview, so as to be able to relate the 
form and course of the interview to the candidate’s particular 
circumstances.’ 

(2) ‘The main object of the interview is to assess the candidate’s 
suitability to pursue a course of study at . . . , special consideration 
being given to the following factors: 

(a) General intelligence. 
(b) Previous education, training and experience. 
(c) Interests and motivation. 
(d) Personality and character.’ 

(3) ‘Interviewing Boards are asked to make a single general 
assessment of the candidate's claims, not to sum up the results of 
separate assessments under each of the above headings.’ 

The Boards were asked to make the overall assessments on a 
nine-point scale. 

Students were tested with a battery of cognitive tests, similar to 
that used by Petrie, but more extensive. They were also given a 
set of general knowledge tests, and tests measuring their aptitude 
in reading Tables, Charts, and Graphs. A detailed biographical 
questionnaire was also completed. Last, the following tests of non- 
cognitive personality qualities were given: (1) Shipley Inventory, 
Form C. (2) Ranking Rorschach test, an adaptation of the Har- 
rower-Erickson Multiple Choice Group Rorschach described by 
Eysenck (1947). (3) Level of Aspiration test, in connection with a 
cancellation test. (4) Level of Aspiration test in connection with a 
decoding test. (Tests 3 and 4 are attempts to make use of the level 
of aspiration technique in a group situation.) (5) Speed test— 
decoding. (6) Speed test—cancellation. (7) Index of Inaccuracy. 
(8) Introversion test. (This consisted of the difference in standard 
scores between the vocabulary and the paper formboard test, a 
score which had been shown previously to be diagnostic of intro- 
version (Himmelweit, 1945).) 
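
The Introversion score in item (8) is a difference between two standard scores, which can be sketched as follows. This is an illustration under stated assumptions: the raw scores are invented, the direction of subtraction (vocabulary minus form board) is assumed from the order in which the tests are named, and the use of the population standard deviation is a choice made here, not something the text specifies.

```python
from statistics import mean, pstdev


def standard_scores(raw_scores):
    """Convert raw scores to standard (z) scores within the group."""
    centre = mean(raw_scores)
    spread = pstdev(raw_scores)  # population SD; an assumption of this sketch
    return [(score - centre) / spread for score in raw_scores]


# Hypothetical raw scores for three students, invented for illustration.
vocabulary = [30, 40, 50]
form_board = [20, 10, 15]

# Assumed direction: vocabulary standard score minus form board standard score.
difference_scores = [
    v - f
    for v, f in zip(standard_scores(vocabulary), standard_scores(form_board))
]
print([round(d, 2) for d in difference_scores])
```

Because both sets of standard scores have a mean of zero, the difference scores also average zero within the group, so the score expresses only a relative preponderance of one ability over the other.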

The criterion against which tests, interviews, and other data 
were validated consisted of the Intermediate and the Finals exam- 
inations. An attempt was made to assess the reliability of these 
examinations, using Guilford's equation 205 (1936) in order to 
obtain a minimum estimate of the reliability of the constituent 
part measures, and using the Kuder-Richardson Formula (1937) 
to obtain a minimum reliability of their sum. Average estimates of 
examination reliability are .74 for the Final examination, and .80 
for the Intermediate. While this is a minimum estimate, an applica- 
tion of Eysenck’s formula for estimating the true reliability of the 
ratings of judges (1939) gave figures very similar to the above, 
which may therefore be accepted as being reasonable estimates of 
the true reliabilities. The papers in intermediate and final examina- 
tions were correlated for a set of 245 students, and the resulting 
matrix factor-analysed. A general factor, accounting for 37 per 
cent of the total variance, was first extracted, followed by a bipolar 
factor, accounting for 6 per cent of the variance, which divided the 
papers into two sets—those given in the intermediate, and those 
given in the final examination. It is clear that what is common to 
the two examinations is more than six times as important as what 
is [specific to] each. 

[This result] suggests the possibility of correlating the two 
examinations themselves in order to find a maximum possible 
prediction—it is very unlikely that a test can predict an examination 
better than another examination very similar to the one serving as 
criterion. This correlation turned out to be very [words illegible] 
of .60. The lowness of this figure reflects in part the unreliability of 
the two examinations; when the observed correlation is corrected 
for attenuation, the corrected value approaches .80, which is a 
better estimate of the true correlation between the two examina- 
tions, assuming both to be perfectly reliable. 
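
The correction for attenuation mentioned here can be reproduced with the classical formula, in which the observed correlation is divided by the square root of the product of the two reliabilities. The sketch below assumes the reliabilities quoted earlier are to be read as .74 (Final) and .80 (Intermediate), the latter being a reading of a damaged figure, and an observed inter-examination correlation of .60. The function name is hypothetical.

```python
import math


def correct_for_attenuation(r_xy: float, rel_x: float, rel_y: float) -> float:
    """Estimate the correlation between two measures if both were perfectly reliable."""
    return r_xy / math.sqrt(rel_x * rel_y)


# Observed correlation between the two examinations and their estimated
# reliabilities (the .80 is an assumed reading of a garbled figure).
corrected = correct_for_attenuation(r_xy=0.60, rel_x=0.74, rel_y=0.80)
print(round(corrected, 2))
```

On these assumptions the corrected value comes to roughly .78, in line with the statement that it approaches .80.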

Test prediction of examination results is inevitably lower than 
the observed inter-examination correlation, and amounts to .46 
for the intermediate examination and .55 for the final examination 
(n = 232 for the former, and 114 for the latter).1 Correction for 
lack of reliability in the criterion would raise both correlations 
approximately to the .6 level, but of course this figure is of purely 
theoretical interest. Of more interest is the finding that, using the 
obtained correlations only as indicators of predictive success, it can 
be shown that the use of tests as selection criteria would reduce the 
number of failures from over 20 per cent to something like 5 per 
cent, under the selection conditions actually obtaining at the same 
time. 


Having noted the general results of the testing procedure, we 
may now return to the non-cognitive tests. Some of these, like the 
Inventory and the Group Rorschach, failed to give significant 
correlations. The two level of aspiration tests gave results in the 
expected direction, i.e. showing greater failure rates among those 
having high aspiration scores; correlations for the Aspiration test 
(cancellation) and Aspiration test (decoding) were -.20 and 
-.11 (Final and Intermediate examination respectively), and 
-.15 and -.10. The Index of Inaccuracy again gave the best 
result, showing correlations of .51 and .26 with the two 
examinations. While because of its mode of derivation this Index correlates 
with the tests of ability from which it is derived, these correlations 
are too low to account for more than a small proportion of its 


1 [Regr]ession weights derived from this sample of [students] [words 
illegible] a correlation of .42 with final success, thus showing the 
usual decline in predictive accuracy associated with multiple R when applied 
to new groups. 
predictive variance. The speed score on the decoding test gave 
significant prediction, while the speed score on the cancellation 
test failed to do so; all these correlations are too low to be of any 
interest. 

As would have been expected from the general hypothesis linking 
Inaccuracy and high Level of Aspiration scores with neuroticism, 
these tests intercorrelate positively. The two Level of 
Aspiration tests intercorrelate .47; Inaccuracy correlates .20 and 
.09 with the other two tests. While these correlations are very 
small, they are derived from tests which are themselves very 
unreliable, and if individual tests had been given, higher 
correlations would almost certainly have been found. 

We may now turn to a comparison of the test results with the 
interview results. In the first place, correlations of interview ratings 
with all tests of intelligence are insignificant; indeed, the majority 
are negative. If the tests are accurate measures of intelligence, the 
interview would appear to tend to give preference to the duller, 
rather than to the brighter students. In the second place, the 
interview shows negligible correlations with the other Entrance 
Examinations (r = -.02 and .10). Thirdly, the interview 
predicts examination success to the completely insignificant extent of 
.07. It can be seen that although the non-cognitive tests used in 
this investigation did not predict success with anything like the 
accuracy one might have hoped for, nevertheless they were 
markedly superior to interviewing procedures of the kind des- 
cribed. This failure of the interview is only one of many instances 
showing the impossibility of achieving reliable and valid predic- 
tion on the basis of subjective ratings, personal impressions, and 
clinical insight; many other examples have been discussed in an 
earlier chapter. However rudimentary our objective methods of 
personality investigation, and however difficult their application 
in the group situation—even now they can be said to give results 
superior in reliability and validity to the interview. 

It should be borne in mind, in assessing the contribution of the 
interview, that the interviewing board had at its disposal the paper 
qualifications and background of the candidate. American 
experience has shown that these data alone would normally give 
predictions considerably higher than those made by the Board. It 
would appear, therefore, that the contribution of the interview 
may well be negative in sign, as well as small in extent. 

If temperamental factors play a part in the achievement of 
University students, it would appear even more likely [that such] 
factors would influence the work of nurses and others [who are in] 
constant contact with people of one kind or another, and [whose] 
academic achievement is only in part related to their [word illegible] 
proficiency. An experiment to test this hypothesis was carried out 
by Petrie and Powell (1950), who tested 126 nurses at one of the 
well-known London hospitals. Petrie paid particular attention to 
the question of a suitable criterion, as the choice of a criterion is of 
the utmost importance in all prediction testing. 

A special rating scale was devised, based on available English 
and American scales intended for nursing and allied professions. It 
asked for a 5-point rating on each of 18 personality and ability 
traits. The rating was carried out after the nurses had been in 
training in the hospital for at least eighteen months. Each nurse 
was rated by three independent judges who knew her well. The 
average intercorrelation between these three judges (matron, ward 
sister, and sister tutor) was .649. A total rating was derived for each 
nurse, consisting of the sum of the ratings given by the three judges 
on the eighteen traits. 

The ratings on the eighteen traits (averaged for the three judges) 
were intercorrelated for the 126 nurses, and a factorial analysis 
was carried out. This gave evidence of two factors, accounting for 
55 and 12 per cent respectively of the variance. The first factor has 
positive saturations throughout, and might best be described as a 
general factor of nursing efficiency. The second factor is bipolar 
and contrasts the following sets of traits: 

(1) Knowledge of underlying principles of nursing practice and 
nursing skills. 

(2) Ability to adapt these to the individual needs of the patient. 
Imagination; foresight; ability to anticipate requirements of new 
situations. 

(3) Ability to rise to the occasion in emergency. Initiative; 
resource; ability to stand up to difficult situations. 

(4) Ability to plan, organize and time duties successfully. 
Management of own work and others. 

These four traits are clearly characterized by their close 
relationship to intellectual capaci[ty] [text missing in source] 
human beings. These traits are: 

(1) Co-operation with other members of the ward team. Influence 
on associates and juniors; relationship with other members of 
the Staff. 

(2) Satisfactory relationship with the patients. Ability to 
gain their co-operation. Patience; understanding; kindness and 
sympathy. 

(3) Satisfactory attitude to patients' families and visitors to the 
ward. Thoughtfulness; kindness and tact; courtesy. 

(4) Loyalty to the standards of the Training School and the 
Hospital. Co-operation with authority; ability to accept criticism. 

It appears, therefore, that our hypothesis regarding two quite 
different types of demand being made of a good nurse is justified. 
One demand is that she should have enough mental ability to cope 
with her work; the other that she should have the kind of tempera- 
ment that allows her to make the maximum use of her ability in 
this profession. The two sets of ratings listed above as having the 
highest positive and negative saturations with the bi-polar factor 
were summed fór each nurse, and the totals, representing respec- 
tively intellectual and social relationship traits, were correlated 
with the tests administered to the nurses. 

Correlations of the tests used with the total rating criterion are 
given in Table LIV. It will be seen that for the cognitive as well as 


TABLE LIV 
List of Tests and their Correlation with the Criterion 

 1. Accurate clerical observation. Minnesota Clerical Test. 
    Number of mistakes in comparison of figures.                      .325 * 
 2. Accurate clerical observation. Minnesota Clerical Test. 
    Number of mistakes in comparison of names.                        .309 * 
 3. Manual dexterity. Average score on O'Connor Tweezer Test.         .289 * 
 4. Persistence at a task. Productivity in word building test.        .263 * 
 5. Persistence at a task. Time spent on word building test.          .156 
 6. Kent-Shakow Performance Intelligence Test. Intelligence score.    .173 * 
 7. Kent-Shakow Performance Intelligence Test. Number of 
    mistaken moves.                                                  -.240 * 
 8. Non-verbal Intelligence. Penrose Pattern Perception Test.         .209 * 
 9. Maudsley Word Association Test. ‘Neurotic’ score.                -.193 * 
10. Adaptation of Interest Test Pressey X-O.                          .178 * 
11. Adaptation of Annoyance Test Pressey X-O.                        -.192 
12. Verbal Intelligence. Mill Hill Vocabulary Test.                   .128 
13. Speed Accuracy Preference. Number of mistakes in trial on 
    track tracer when accuracy is stressed.                          -.175 * 
14. Concentration Test.                                               .152 
15. Distractibility Test. Number of mistaken answers.                [.192?] 
16. Strength of maximum grip on the Dynamometer.                      .142 

* Starred correlations are significant at the 5 per cent level. Most of the 
others are in the predicted direction and would be significant if the one tail 
test were applied. 
for the non-cognitive tests all the correlations are in the expected 
direction, and that most of them are significant. A multiple R of 
above .6 was obtained from a combination of 12 of these tests. 
Correlations of selected tests with the ‘intellectual ability’ and the 
‘personal relationship’ criteria are given in Table LV. It can be 
seen that some of the tests are more closely related to the ‘ability’ 
criterion, others to the ‘personal relationships’ criterion. For 
example, both the non-verbal and the verbal tests of intelligence 
are closely related to the ratings on ‘ability’, but not at all to those 
on ‘personal relationships’. The score on the Word Connection 
List, which is a neuroticism test, is more highly correlated with the 
‘human relationships’ criterion than with the ‘ability’ one. On the 


TABLE LV 

CORRELATIONS OF INTELLECTUAL ‘ABILITY’ AND ‘PERSONAL 
RELATIONSHIP’ TRAITS WITH SOME OF THE TESTS USED 

                                                    Intellectual   Personal 
                                                    Ability        Relationship 
                                                    Traits         Traits 

O'Connor Tweezer. Average number on two trials        +.286          +.207 
Word Building. Number of words produced               +.302          +.136 
Kent-Shakow Form Boards—number of mistaken moves      [-.906?]       [illegible] 
Penrose Pattern Perception Test. Number correct       +.270          +.042 
Maudsley Word Connection List. Neurotic score         -.160          -.219 
Educational Status                                    [.131?; second value illegible] 
Mill Hill Vocabulary Test                             [.26?]         -.086 
Nursing Examination—theoretical                       +.482          +.152 
Nursing Examination—practical                         +.358          +.363 


manual dexterity test, which we have shown elsewhere to be a test 
of neuroticism as well as a test of ability, scores appear to be related 
to both types of criteria. It is also in line with our hypothesis that 
when the examination marks of the nurses are correlated with the 
two criteria, the theoretical examination is found to correlate .482 
with the ‘ability’ criterion, but only .152 with the ‘personal 
relationships’ criterion, while the practical examination correlates 
equally well with both criteria (r = .358 and .363 respectively). 
The outstanding results of this experiment are (1) the relatively 
small role played in nursing efficiency by intelligence (particularly 
as measured by verbal tests—correlation of Vocabulary with the 
total rating was not significantly different from zero), and (2) the 
relatively important role played by temperamental factors, as 
measured by various tests of neuroticism. None of the zero-order 
correlations are very high, but in combination they predict effici- 
ency in nursing better than a battery of purely cognitive tests. 


(5) Sense of Humour and Popularity in Teachers 


So far in this chapter, we have discussed the position in factorial 
space of concepts such as employability, job satisfaction, job effici- 
ency and others having occupational reference. The same technique 
of dimensional analysis, of course, can be applied to concepts 
having no such reference, and as an example an experimental 
study of the position of ‘sense of humour’ with respect to intel- 
ligence and neuroticism will be quoted. This investigation, carried 
out by F. Loos (1951), made use of several types of test to measure 
five separate meanings of the term, ‘sense of humour’. 

(1) Sense of humour may be looked at from the point of view of 
appreciation; a person is considered to possess this trait if his 
appreciation of the relative funniness of jokes, films, events, etc. agrees 
with our own, or with that of the majority. (2) Again, a person 
may be considered to possess ‘sense of humour’ if he appreciates a 
large number of jokes, witticisms, etc., or laughs a great deal about 
many things, irrespective of the order of funniness into which he 
would put these jokes, events, etc. (3) Alternatively, sense of 
humour may be looked at from the point of view of creation; a 
person is considered to possess this trait if he is constantly making 
jokes, drawing attention to amusing features of the situation, or in 
other ways creating merriment. (4) A fourth method of looking at 
sense of humour would emphasize the ‘social stimulus’ quality of 
the person, and rely on ratings by others of his ‘sense of humour’. 
(5) Lastly, sense of humour may be defined in terms of self-ratings. 
These five definitions may give rise to tests which are highly 
correlated, but it is quite possible that they may point to quite 
unrelated aspects of personality. 

The following tests were included in the research to measure 
the five aspects of ‘sense of humour’. (1) A Limerick Ranking test, 
in which twelve limericks have to be ranked in order of funniness; 
the score is the amount of agreement with the average ranking of 
the whole group. (2) Limerick Liking test, in which the subject has 
to indicate how many of the limericks he considers funny; this 
number is his score. Both these tests were taken from an earlier 
paper by Eysenck (1943). (3) With respect to the ‘creative humour’ 
aspect, two tests were included. In the first of these, captions had 
to be written for cartoons from which the original [captions had] 
been eliminated; in the second, social situations were out[lined to] 
the subjects who had to find an amusing ending for each situation. 
Scoring of these creative efforts was carried out by judges [who] 
did not take part in the experiment. The validity of the [average] 
of twenty judges’ ratings, using a formula developed [earlier] 
(Eysenck, 1939), was found to be in the vicinity of .90, so [that] 
scores on these two tests are reasonably objective. (4) Social humour 
rating. This consisted of an averaged rating of the subject’s [sense] 
of humour, made by her colleagues. (5) In addition to these tests 
and ratings, each subject was asked to give a self-rating of her [own] 
sense of humour. In addition, a ‘popularity rating’ was obtained 
[for e]ach girl. 
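
The Limerick Ranking score is described only as ‘the amount of agreement with the average ranking of the whole group’; the text does not say how agreement was computed. One plausible way to express it, offered here purely as an assumption, is Spearman's rank correlation between a subject's ranking and the group ranking. The rankings below are invented, and the simple formula used is exact only when neither ranking contains ties.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman rank correlation for two untied rankings of the same items."""
    n = len(rank_a)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))


# Hypothetical data: group ranking of twelve limericks (1 = funniest)
# and one subject's ranking with three adjacent pairs interchanged.
group_ranking = list(range(1, 13))
subject_ranking = [2, 1, 3, 4, 6, 5, 7, 8, 9, 10, 12, 11]

agreement = spearman_rho(subject_ranking, group_ranking)
print(round(agreement, 3))
```

A subject who reproduces the group order exactly scores 1, and one who reverses it scores -1, so higher values indicate closer agreement with the majority.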
To determine the position of each girl on the intellectual dimension,
Thurstone’s Primary Mental Abilities tests were given to the girls;
we thus obtain scores on the space, verbal, reasoning, number,
and word fluency factors. To assess neuroticism and other possible
temperamental factors, a group Rorschach was given,¹ as well as
the Word Connection List, and the Worries, Likes, and Dislikes
tests mentioned earlier in the book. Also used was the Rosenzweig
Picture Frustration test, scored for Impunitiveness and for
Extrapunitiveness minus Intropunitiveness.
The subjects were 76 girls in a teachers’ training college, with
an average age of 19. Scores on the nineteen tests used were intercorrelated
for the whole population, and a factorial analysis performed.
Four significant factors were extracted. Factors I and II
are plotted in Figure 36. The five Thurstone tests define factor I
as an intelligence factor; the four neuroticism tests (Word Connection
List, worries, word likes and dislikes) define the second
factor as one of neuroticism. The results threw some interesting
light on the relation of sense of humour to intelligence and neuroticism.
Let us take the Limerick Ranking test first. Agreement with
the average here is indicative of lack of neuroticism, as might have
been expected; the slight positive saturation of this test with intelligence
is also not contrary to expectation. The Limerick Liking
test is also correlated with neuroticism, in the sense that the more
stable girls tend to like a larger number of limericks. This test has no
appreciable saturation with intelligence. We may conclude that


¹ The Rorschach was scored for neuroticism by means of the ‘sign’ method.
Individual correlations were calculated between sense of humour tests and
various Rorschach categories; these correlations were all insignificant.


25. Needle-threading test. 


26. Close-up view of needle-threading test. 


27. Agility (water-carrying) test. 


28. Kretschmer abstraction test. 
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the more stable subjects like more limericks, and agree in their 
preferences more with the average judgment, than do the less 
stable girls. 
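As an illustration of what intercorrelating the tests and performing a factorial analysis involves, the following sketch extracts a first factor from a small table of correlations by the centroid method. The source does not state which method of extraction was used, so the centroid method is an assumption, as are the three-test correlation table and the function names; unities are placed in the diagonal for simplicity.

```python
def first_centroid_factor(r):
    """First-factor saturations by the centroid method.

    Each test's saturation is its column sum in the correlation
    matrix divided by the square root of the grand total.
    """
    n = len(r)
    col_sums = [sum(row[j] for row in r) for j in range(n)]
    total = sum(col_sums)
    return [s / total ** 0.5 for s in col_sums]


def residual_matrix(r, loadings):
    """Correlations left over once the first factor is removed."""
    n = len(r)
    return [[r[i][j] - loadings[i] * loadings[j] for j in range(n)]
            for i in range(n)]


# Hypothetical correlations between three tests (not data from the study).
r = [
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
]
loadings = first_centroid_factor(r)
residuals = residual_matrix(r, loadings)
```

By construction the residual table sums to zero, which is why further centroid factors require some tests to be reflected before extraction can continue.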
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As regards the two creative humour tests, it will be seen that 
they both have slight correlations with intelligence, as might have 
been expected. Their saturations with the neuroticism factor are 
small and opposite in sign to each other, so that we cannot posit any 
relation between neuroticism and creative humour as measured 


here. This lack of relation may be a mere artefact, due to the 
imperfection of the tests used. 


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‘The social humour rating, and the subjective sense of humour 
rating, both show marked saturations with the neuroticism factor
in the expected direction, and fail to show correlations with the
intelligence factor. It is interesting in this connection to see that
the popularity rating is unrelated to either neuroticism or intel- 
ligence; in other words, an unstable girl, or an unintelligent one, is 
just as likely to be popular as a stable, or a bright one. But an 
unstable girl is not likely to be thought of as having a good ‘sense
of humour’, nor is she likely to consider herself as possessing this 
trait. This agreement between self-judgment and poor rating is 
surprising in view of the frequent claims made in the literature 
that few people admit to having a poor sense of humour; perhaps 
the imperative demand for this quality in the American culture is 
more pressing than it is in England! 

A third factor was clear] 
the social humour ratin 


was in the same directio 
Ranking score. As the 
namely the two Rose 
follows that in this ana 
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was designed to analyse. Clearly the results cannot be regarded as 
in any sense final; different sets of tests, different populations, 
different culture patterns might easily produce results differing 
widely from those reported. Nevertheless, it is only through com- 
parisons of different experiments, carried out in different con- 
ditions, that we can learn about the influence of those aspects 
of the experimental situation over which we have no control. At 
the very least, this research would seem to have confirmed the 
suspicion expressed by many writers that the concept of ‘sense of 
humour’ is not a unitary one, but that we are dealing rather with 
a multitude of independent aspects which must be quantified and 
studied separately. This research marks the beginning of a taxo- 
nomic study of the position of these various aspects within the 
total personality space. 


Chapter Eight 


THE ORGANIZATION OF PERSONALITY 


ALL through this book, the reader will have noted that implicitly
or explicitly a stand has been taken on a very important
theoretical issue. In the field of personality research, as in
the more strictly experimental fields of perception and memory,
there are two fundamental ways in which we can order our thinking,
and along which we can plan our experiments and construct
our theories. The first of these is that of atomism or elementalism;
the other is that of wholeness or gestalt, the organismic way. The
great majority of psychologists who deal with the concept of
personality either experimentally or clinically have adopted the
organismic point of view; the present book, like its predecessor,
takes an outspokenly atomistic, elementalistic point of view. Here
and there, throughout this volume, we have taken up this argument
with respect to individual issues; now we must come to grips
with it on a somewhat more fundamental level.

First of all, let us see precisely what is meant by the ‘organismic’
approach in personality study. We cannot do better than to quote
from a book which has put the wholistic point of view better and
more clearly than any other, namely, the Assessment of Men (O.S.S.
Staff, 1948). Apparently, […]tion systems, and made this the […]
Underlying this feature of O.S.S. […] calls for accurate […]
processes, whereas the organismic […] estimations of total integrated
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processes’. It follows from this that ‘for a short over-all assessment 
the interview is probably the best and only indispensable method 
we have’. In contrast, ‘the elementalistic method is abstract and 
unrealistic, since no attempt is made to reproduce the conditions 
under which the men will eventually perform. ... In adopting 
this method . . . the psychologist makes a radical subjective judg- 
ment at the very start by electing constituent processes, testing for 
these separately, and then adding the scores to arrive at a final 
rating. He does this even though he knows that in actual life the 
mind does not add sequences of elementary processes to produce 
results, but organizes them into effective forms.’ 

Clearly, the organismic point of view is a tenable one; there are
no internal contradictions which would make the system of beliefs 
briefly presented here illogical and mutually contradictory. Nor 
can the elementaristic point of view be dismissed on logical grounds. 
It follows that we must turn to the experimental evidence to decide 
between these two contradictory systems. Such evidence may be 
sought at three levels: (1) The level of total assessment, as in the 
O.S.S. programme itself; (2) an intermediate level, as in the 
case of Rorschach test interpretation along statistical or intuitive 
lines; (3) a fundamental level, as in perceptual organization, for 
instance. 

At the level of total assessment, we have already quoted very 
briefly the Michigan study (Kelly, 1949, 1947; Kelly and Fiske, 
1950). ‘The large number of subjects studied, the very large number 
of techniques of measurement and assessment used, and the excel- 
lence of the experimental design would have made the findings 
from this programme outstanding, even if they had been less 
revolutionary in their import. From every point of view, the final 
results are a devastating comment on and criticism of the clinical 
methods of interviewing and projective testing current in most 
applied personality work. It was found that the most efficient 
clinical predictions in terms of both validity and economy of data 
are those based only on the matters contained in the credentials 
file and in the objective test profiles. The addition of autobio- 
graphical and projective test data appears to have contributed 
little or nothing to the validities of the assessment rating. Neither 
the initial nor the intensive interview made any apparent con- 
tribution. In fact, the predictions based on the credentials and 
objective tests are better than those made at the end of the programme
on the basis of test procedures and observations! This
consistent trend would seem to be all the more significant in view
of the fact that assessment staff members tended to be […]
of the opinion that the interview contributed most to their ‘understanding
of the case’, […] either the projective test or […]’ (Eysenck, 1951).

Kelly and Fiske (1950) point out that ‘predictions based on
individual projective tests as well as those based on an integration
of data from all four projective tests yielded relatively low correla- 
tions with the rated criteria. Scores from a single objective test 
obtainable by mail at little cost predicted each of several criteria 
as well as all the clinical judgments made in the entire assessment 
programme.’ They go on to say that ‘many who have seen our
results have been disturbed by the findings regarding the validity 
for this selection problem of specific techniques, which are felt by 
many professional psychologists to have a high degree of face 
validity (or is it faith validity?). Thus, it was the firm conviction 
of the staff of the O.S.S. Assessment Programme that the global 
evaluation of a person permits much more accurate predictions of 
his future performance than can possibly be achieved by a more 
segmental approach. . . . Our own findings to date serve to raise 
doubts concerning the validity of this general proposition. . . .
Although the unstructured interview is one of the most widely used 
tools in personnel selection, the writers know of no evidence in the 
literature to suggest that 
low validity, 
esteem with 
aspect of o 
relationshi 


of the organismic psyc 
clinical psychologists w 
and made the predicti 
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ismicists, and it may be added that such proof must be experi- 
mental in kind—semantic arguments from supposedly higher 
general principles are not sufficient to controvert empirical findings. 

At the second level, comparatively little research has been 
carried out; the small-scale experiment comparing statistical and 
intuitive diagnostic procedures with respect to the Rorschach test, 
which is outlined on page 163 of this book, may serve as an 
isolated example. This neglect of a very promising approach is 
rather odd; few psychologists have the facilities needed for a large- 
scale inquiry such as would be required at the highest level of 
complexity, but most research workers would find it very easy to 
set up an experiment at this level of complexity. Much useful 
information could be gained in this way regarding the efficacy of 
these two approaches; as far as our data are concerned the out- 
come of the experiment is again hostile to the organismic approach. 

It is at the lowest level of complexity, however, in connection 
with the processes of perception or memory, that gestalt writers 
have been able to support the organismic view most strongly, and 
it is at this level that their view has achieved its greatest success. 
Consequently, experimentation specially planned to test the rival 
hypotheses in this field also is of particular interest and import- 
ance. If the organismic hypothesis fails here too, then we may say 
with some degree of confidence that it is unlikely to stand up to 
careful experimentation in more complex fields. We turn next, 
therefore, to an experiment which superficially may seem to have 
little to do with the study of personality, but which throws much 
light on the principles underlying the construction of psychological 
models of organization. This experiment, reported by Granger 
(1950), deals with aesthetic appreciation; it covers a field, therefore,
which saw the original gestalt hypotheses grow (as in the
work of Ehrenfels), and which is still regarded as the strongest 
fortress into which to retire for a last stand by many organismicists. 

Granger took his stimuli from the Munsell Colour System 
(1921), which is organized around three dimensions: Value (more 
usually called lightness, i.e. the scale of greys from white to black); 
Chroma (more usually called saturation); and Hue (i.e. the special 
qualitative attribute which distinguishes red from blue, or green 
from yellow). Each dimension is intended to form a scale of per- 
ceptually equidistant steps. Three sets of tests were made up: 
(1) Hue tests, of which there were 24, made up in such a way as to 
hold value and chroma constant within each test. As far as possible, 
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all ten hues were sampled in each case at all levels of the colour 
solid. (2) Value tests, of which there were 24, made up in such a 
way as to hold hue and chroma constant. (3) Chroma tests, of 
which there were 11, made up in such a way as to hold hue and 
value constant. The coloured chips which constitute the stimuli 
for each of these tests were presented to 25 male and 25 female 
subjects, screened for colour-blindness by means of three standard 
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wavelength of the colours, in the sense that the hues of shorter 
wavelength are preferred to those of longer wavelength. 

We have given, very briefly, some of the facts regarding the 
aesthetic appreciation of the simplest possible stimulus, viz. a single
colour. Can we use these results to forecast appreciation of colour 
combinations? Here we come against our first crucial problem. 
According to the atomistic or elementaristic type of analysis, we 
would expect the following considerations to determine prefer- 
ences for colour combinations: (1) Preferences for the individual 
colours contained in the combination. If we took combinations of 
two colours and had these ranked, then we should expect the 
resulting ranking to show substantial agreement with a ranking 
carried out simply on the basis of the summed preference ratings 
for the single colours making up each dyad. (2) Relations between 
the colours contained in the combination. Thus if we take the hue 
circle, we might find that liking for a given combination of two 
colours increases as the distance between them on the colour circle 
increases. This rule also could easily be translated into a numerical, 
‘additive’ rule which would specify exactly in each case which of 
two colour dyads should be preferred. (3) Specific and error factors. 
We are unlikely to find perfect test-retest correlations; indeed, 
reliabilities of simple preference tests are seldom above ·7 to ·8.
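The first two considerations can be turned into an explicit forecasting rule. The sketch below is only one way of doing so, under stated assumptions: each dyad receives a score C (the sum of the single-colour preferences) and a score I (the distance between the two hues on a circle of ten steps), and the two scores are standardized and added with equal weight. The equal weighting, the preference values, and all function names are hypothetical and are not taken from the experiment.

```python
from statistics import mean, pstdev


def hue_distance(a, b, n_hues=10):
    """Shortest distance between two positions on a circle of n_hues steps."""
    d = abs(a - b) % n_hues
    return min(d, n_hues - d)


def standardize(values):
    """Express values as deviations from their mean in units of their spread.

    Assumes the values are not all equal (otherwise the spread is zero).
    """
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def predict_order(dyads, preference, n_hues=10):
    """Order colour pairs from most to least preferred by the additive rule."""
    c = standardize([preference[a] + preference[b] for a, b in dyads])
    i = standardize([hue_distance(a, b, n_hues) for a, b in dyads])
    combined = [ci + ii for ci, ii in zip(c, i)]
    return [dyad for _, dyad in sorted(zip(combined, dyads), reverse=True)]


# Hypothetical liking scores for three hues at positions 0, 2 and 5.
preference = {0: 3.0, 2: 1.0, 5: 2.0}
dyads = [(0, 2), (0, 5), (2, 5)]
order = predict_order(dyads, preference)
```

The third consideration, specific and error factors, is what remains unexplained when such a forecast is compared with the observed ranking.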

According to the gestalt hypothesis, these notions would be 
considered subject to criticism. Preferences for colour combina- 
tions are emergent qualities, essentially unlike the constituent qualities 
which determine preferences for single colours, and impossible to 
predict from any knowledge of single colour preferences, or from 
any quantitative relations obtaining between the colours.¹

Here, then, we would appear to have an experimental model 
which may be expected to produce quite clear evidence for or 
against the holistic approach. Prediction of preferences for colour 
combinations is in principle possible on the basis of knowledge of 
(a) preferences of the single colours, and (b) knowledge of the
objective relations obtaining between them along the dimensions 
of the colour solid; that is the bald statement of the atomistic 


¹ It is interesting to note that the first psychologist to maintain this gestalt
view was the arch atomist Wundt (1910), who maintained that preferences for
colour combinations would be determined by a ‘Totalgefühl’, an emergent
configuration with unique properties over and above those of the component 
stimuli. Külpe (1909) and Titchener (1912) held a contrary view, more in line 
with that advanced here. 
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position. Prediction based on these principles is impossible—what 
is involved are emergent qualities which cannot be reduced to 
known properties of the stimuli considered apart; that is the bald 
statement of the organismic position. We must now turn to the 
experimental evidence. 
The first demonstration relates to the question of relation 
between stimuli, i.e. the hypothesis that colour combinations are 
preferred in terms of the distance between the component colours 
on the colour circle. The correlation, when a standard colour was 
presented together with a variable colour, was found to vary from 
·93 to unity in different experiments, so that we may assume that
the hypothesis is essentially correct. (In a similar experiment by
Clarkson et al. (1950), correlations of ·96 and ·99 were observed,
thus lending support to this generalization.)¹ Orders of preference
for colour combinations correlated highly from subject to subject,
for hues, values, and chroma. W coefficients average around ·6 to
·7, thus showing marked agreement. Not only do individuals agree
with respect to their rankings of the stimuli on each of the tests, but
it can also be shown that the hue tests correlate with each other
(W = ·96), the value tests correlate with each other (W = ·46), and
the chroma tests correlate with each other (W = ·37). And lastly,
[…] the subjects on the hue, value, and chroma tests
[…] to the extent of W = […], i.e. to an extent identical with
[…] stimuli were used.
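The W coefficients quoted above are presumably Kendall's coefficient of concordance, which measures how far several judges agree in ranking the same set of objects. A minimal sketch follows, assuming complete rankings without ties; the function name and the small rankings are illustrative only.

```python
def kendalls_w(rankings):
    """Kendall's coefficient of concordance for m judges ranking n objects.

    rankings[j][i] is the rank judge j gives to object i (1..n, no ties).
    W is 1 for perfect agreement and 0 when the rank totals are all equal.
    """
    m = len(rankings)
    n = len(rankings[0])
    totals = [sum(r[i] for r in rankings) for i in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Three judges in complete agreement, then two judges in complete opposition.
w_agree = kendalls_w([[1, 2, 3], [1, 2, 3], [1, 2, 3]])
w_oppose = kendalls_w([[1, 2, 3], [3, 2, 1]])
```

Values of ·6 to ·7 therefore lie well towards the agreement end of the scale.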
r cach subject on the single colour 
€ hues, values, and chromas tests), 
es on the combined colours rank- 
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ndicates very strongly that the laws 
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individual colours making up the combinations. Then correlations 
were run between the observed ranking of the colour combina- 
tions (O), the hypothetical ranking derived entirely from the pre- 
ference judgments for the component colours (C), and the hypo- 
thetical ranking derived entirely from the colour intervals of the 
components (I). The correlation between O and C is ·631, and
that between O and I is ·554; C and I are of course uncorrelated
(r = ·004). When C and I are combined, they predict the final
ranking to an extent indicated by a correlation of ·84. When
corrections for reliability are introduced it can be shown that the 
predictive accuracy of the combination of C and I is not far short 
of 100 per cent. Thus the final preference judgments of colour 
combinations are determined entirely by the elements entering 
into the combination, and the relations between these elements as 
determined by simple algebraic addition. It is difficult to see how 
the gestalt hypothesis could be invalidated more conclusively. 
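The combined figure can be checked from the three correlations reported, using the standard formula for the multiple correlation of a criterion with two predictors. The correction for attenuation shown afterwards uses hypothetical reliabilities of ·85, since the reliabilities actually entered into the correction are not given in this passage; the function names are likewise illustrative.

```python
from math import sqrt


def multiple_r(r_y1, r_y2, r_12):
    """Multiple correlation of a criterion with two predictors."""
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return sqrt(r_squared)


def correct_for_attenuation(r, rel_criterion, rel_predictor):
    """Estimate the correlation between perfectly reliable measures."""
    return r / sqrt(rel_criterion * rel_predictor)


# Reported values: r(O, C) = .631, r(O, I) = .554, r(C, I) = .004.
r_combined = multiple_r(0.631, 0.554, 0.004)

# Hypothetical reliabilities, chosen only to illustrate the correction.
r_corrected = correct_for_attenuation(r_combined, 0.85, 0.85)
```

Because C and I are practically uncorrelated, the squared multiple correlation is very nearly the sum of the two squared correlations, and the result rounds to the ·84 given in the text.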

We can, however, take one further step. Even colour combina- 
tions may be said to be at a relatively low level of complexity. 
What would happen if we correlated score on the colour ranking 
test with score on some much more complex test of art apprecia- 
tion, such as the Maitland Graves Design Judgment test (1948), in 
which judgments have to be made between good and poor designs 
of considerable complexity? This test, which has been shown to 
have considerable validity and good reliability, is entirely in black 
and white, so that colour plays no part in it. It is easy to see that
on the gestalt hypothesis no positive prediction could be made; 
here again those elusive emergent properties would prevent any 
application of knowledge gained from less complex stimuli. If we 
take our stand, however, on the demonstration of a general factor 
of aesthetic judgment (Eysenck, 1940), a demonstration based
entirely on an atomistic view, then we would predict a sizeable correlation
between these variables. In fact, correlations of ·59 and
·73 were observed when the Maitland Graves test was correlated
with results from the single colour test and the colour combination 
test. (It should be noted that these correlations are not due to 
differences in intelligence; only very small correlations were found 
between intelligence and the colour ranking tests.) Here again, 
then, the atomistic view is supported by the test results, and the 
gestalt hypothesis discredited. 

As in connection with previous demonstrations, we do not wish 
to imply that this particular study has once and for all settled the 
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theoretical conflict between associationism and gestalt. We have 
reported it in some detail because it is only by constantly bringing 
these complex theoretical issues into direct contact with experi- 
mental data that we can hope to arrive at a meaningful and 
scientifically worth-while conclusion. If those who follow the 
organismic line of argument would similarly subject their concepts 
and arguments to empirical testing then we might see a consider- 
able growth in the area of agreement between different schools of 
psychology, and an abatement in the purely semantic discussion 
which so often passes for theorizing. As in so many other areas of 
social psychology, ‘the same old concepts and words are tumbling 
around in the same old empty drum to the resounding echo of 
excessive verbalization’ (Eysenck, 1951d). 

How, then, it may be asked, does the atomistic psychologist 
view the problem of personality organization? We may perhaps 
attempt to answer this question by reference to an urgent psy- 
chiatric problem, namely, that of diagnosis. Reference has already 
been made in an earlier chapter to the conflicting views, and the 
often self-contradictory pronouncements, of psychiatrists and 
clinical psychologists. In their terminology, they regard the patients 
under their care as falling into one of a number of different classes 
(hysteria, schizophrenia, neurasthenia, etc.), a view which is only 
permissible if these different disorders are regarded as qualitatively 
different; yet in much of their writing and even more so in their 
practice they regard these different classes as blending into one 
another, as being mutually overlapping, as being in fact differentiated
along quantitative lines. Clearly the qualitative and the
quantitative aspects must be reconciled in some way, but little
help is to be found in the text-books.

The answer is implicit in the experiments reported in the earlier
chapters of this book: we must determine the required number of
dimensions, locate them accurately, and measure them with a given
degree of reliability and validity. Figure 37 may serve as a very
rudimentary model of the kind of structure we have in mind. Using
our experimentally demonstrated three factors of neuroticism,
psychoticism, and extraversion-introversion as three axes of a
co-ordinate system, we can now locate a given patient in terms of
his exact position within this system. Leaving out of account the
extravert dimension for the moment, we can see that the average
person would lie in the centre of the
diagram, at A; a strongly psychotic person, undifferentiated with 
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respect to neuroticism, would be located at P, while a strongly 
neurotic patient, undifferentiated with respect to psychoticism, 
would be located at N. A person suffering from both psychotic 
and neurotic disorders would be found at point P + N. In terms 
of this diagram, the question: ‘Is this person psychotic or neurotic?’ 
becomes as unreasonable as the question: ‘Is this patient intelligent 
or tall?’ ‘Two orthogonal vectors, like neuroticism and psychoticism,
generate a plane on which the position of an individual has
to be indicated by reference to both vectors; we can only describe an 
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Figure 37 


individual by giving both his I.Q. and his height, or by giving both 
his degree of neuroticism and of psychoticism. All positions on the 
plane thus generated are possible locations for a given individual, 
and it will be seen that mixed cases are far more likely than pure 
cases—we are more likely to find individuals in the plane of the 
diagram than on the ordinate or on the abscissa. This preponder- 
ance of mixed cases of course agrees well with clinical experience. 
Diagnosis on this showing should consist in the accurate determin- 
ations of an individual’s position on the plane, rather than, as is now 
usual, in a simple either-or judgment’ (Eysenck, 19518).
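The dimensional view of diagnosis amounts to describing a person by co-ordinates rather than by a label. A minimal sketch is given below, assuming that scores on the two factors are expressed as deviations from the average person; the class name, the field names, and the particular values chosen for the points A, N and P are hypothetical.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Position:
    """A person's location on the neuroticism-psychoticism plane."""
    neuroticism: float
    psychoticism: float

    def __add__(self, other):
        # Orthogonal dimensions combine by simple vector addition.
        return Position(self.neuroticism + other.neuroticism,
                        self.psychoticism + other.psychoticism)

    def distance_from_average(self):
        """Distance from the centre of the diagram (the average person)."""
        return hypot(self.neuroticism, self.psychoticism)


A = Position(0.0, 0.0)   # the average person
N = Position(2.0, 0.0)   # strongly neurotic, average on psychoticism
P = Position(0.0, 2.0)   # strongly psychotic, average on neuroticism
P_plus_N = P + N         # a mixed case, high on both dimensions
```

An either-or label would have to assign the mixed case to N or to P and so discard one of its two co-ordinates.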

It is not hypothesized, of course, that the three dimensions 
dealt with in this book are the only ones into which personality can 
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along which measurement should take place. To 
[…] there is the case of intelligence, […]
defined in terms of Thurstone’s second-order factor, which is
approximately orthogonal to all the dimensions so far discussed.
[…] other dimensions will no doubt be isolated and
measured, and much prospecting has already been done by […]
(1946) into possible lines of progress. But regardless of the actual
number of independent dimensions which our picture of personality
may require, it is clear that categorical diagnoses of the either-or
kind are not warranted by the experimental findings; what is
required is a separate assessment and measurement of each dimension
in turn. It is not claimed that more than a beginning has been
made in this complex, time-consuming, and difficult proceeding;
it is believed, however, that results to date are fully in agreement
with the general model of personality on which our procedures
have been predicated.
We have then a general outline into which to fit such facts as
future research may discover. It is clearly a static outline, in the
same sense in which the periodic system of the elements is static;
however, movement can easily be introduced into this […]
when developmental facts can be experimentally controlled. As
examples we may point to the studies on identical twins and on the
after-effects of leucotomy summarized in earlier chapters.

The construction of such a general frame-work, of course, does
not release us from the obligation to investigate separately the
various factors which go to make up the structure, and to […]
psychological hypotheses regarding their nature. Similarly, careful
experimental studies are called for to investigate individual […]
known to have high saturations for any of the factors postulated.
Such work, however, must follow the establishment of the main
dimensions of personality; it cannot precede it. Consequently, we
have little to add on these counts; it must be left to future work to
fill in these lacunae.
Even so, however, certain theoretical speculations may be of
some interest, and they are advanced here with the explicit warning
that much experimental work will be required before we can
assess their value. These speculations relate particularly to the
neuroticism factor, possibly because more is known empirically
regarding this factor than about any other; they are based on
certain obvious analogies between neurotic illness and such physical
events as, for instance, the fracture of metals, analogies which
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have passed into the very way in which we talk about neurotic 
‘breakdown’, being under ‘stress’, reaching ‘breaking-point’, or 
having a ‘brittle’ personality. This analogy can be pushed a good 
deal further, and as long as it is recognized as an analogy and 
nothing more than that, it may throw some interesting light on our 
explorations of those weaknesses in personality which give rise to 
neurotic break-downs. 

It may come as a shock to many psychologists not well versed 
in physical theory and knowledge that present-day knowledge of 
fractures of metals and their causes is far from satisfactory, and 
that there are various theories in the field none of which has as yet 
succeeded in providing direct experimental support for its con- 
tentions. Again, as in psychology, we have the historical and the 
ahistorical approach, as well as the controversy between the adher- 
ents of a ‘dynamic’ and those of a ‘static’ method of analysis. A
brief description of current physical thought on this subject may 
provide a frame-work for our subsequent discussion. 

‘The basic scientific problem of the general conditions which 
determine the ductile and brittle failure of metals is still largely 
unsolved’ (Sachs, 1948). Two main avenues of approach are 
being used to investigate these phenomena. The micro-mechanical 
method is being used to study the effect of various factors such as 
lattice imperfections, and the formation, reorientation, and growth 
of cracks. The phenomenological method studies the general macroscopic
laws underlying fracture. ‘Eventually the two types of investigations
will be correlated to provide a single self-consistent theory
for fracturing of metals’ (Dorn, 1948). Most of our knowledge
concerning the effect of stress on the fracture of metals has been 
obtained from the phenomenological approach. There is a direct 
complement here with psychological work, particularly with the 
distinction drawn between the molar and the molecular approach. 
The micro-mechanical method would find its analogue in detailed 
neurological work, or in the kind of work summarized by Selye 
under the revealing title of ‘Stress’ (1950). The phenomenological 
method would find its analogue in the molar approaches favoured 
by most psychologists nowadays. We may indicate this correspond- 
ence in the form of a Table: 


Physical methodology:            Psychological methodology:
(1) Micro-mechanical method.     (1) Molecular method.
(2) Phenomenological method.     (2) Molar method.
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It used to be assumed in physics that there were two general 
criteria for fracture: (1) Fracture occurs when a critical state of 
strain is achieved, and (2) fracture occurs when a critical state of 
stress is achieved. It is now accepted on the basis of recent investi- 
gations that the critical strain and critical stress laws for fracture 
are individually incorrect, and that we must follow Bridgman in 
postulating (3) that the stress level for fracture depends upon the 
entire stress and strain histories preceding fracture. Here we have 
another analogue with psychological theory, contrasting the his- 
torical with the constitutional point of view. And again the reso- 
lution is similar—a recognition of both elements, and an attempt 
to obtain quantitative estimates of the influence of both. 

A geometrical representation in many ways similar to that 
provided by factor analysis in psychology may render the position 
in the physical field a little clearer. It can be shown that ‘the state 
of stress in a homogeneous isotropic metal is completely defined by 
the three principal stresses. Therefore, the stress state at fracture 
can be represented by a stress surface in three-dimensional cartesian 
co-ordinate stress space as shown in Figure 38. Whenever the 
stresses are less than the values on the limiting surface, the metal 



Figure 38.—Fracture Surface in Stress Space 
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will not fracture. But when the stress state reaches the limiting 
surface, fracturing will take place. Stressing along line OL, there- 
fore, will not lead to fracture until a state of stress represented by 
point F on the fracture surface is reached.’¹

It must, of course, be realized that this fracture surface is in no 
way to be regarded as rigid; its shape and size are determined by 
the previous stress or strain history of the metal, and alter with it. 
It resembles a flexible membrane whose distortion is dictated by 
the stress or strain history. For homogeneous isotropic metals, only 
certain shapes of the stress surface are theoretically possible. The 
surface must be symmetrical about line OM which makes equal 
angles with the positive directions of the three principal stresses. 
Furthermore, the surface must be independent of the orientation 
of the X, Y, Z co-ordinate axes of the parts being stressed. There- 
fore, the function of the principal stresses representing the fracture 
surface of a homogeneous isotropic metal must be invariant relative 
to rotations of the co-ordinate axes in the part. 
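
The invariance requirement may be made concrete by a small numerical sketch. It is not taken from the source: it assumes the standard three scalar invariants of a symmetric stress tensor (here called I1, I2, I3), and the stress values are hypothetical. Any fracture function written in terms of these invariants is, on that assumption, automatically independent of the orientation of the X, Y, Z axes.

```python
import math

def rotate_z(stress, angle):
    """Return R * stress * R^T for a rotation of the axes about Z by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    r = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    # tmp = R * stress
    tmp = [[sum(r[i][k] * stress[k][j] for k in range(3)) for j in range(3)]
           for i in range(3)]
    # result = tmp * R^T
    return [[sum(tmp[i][k] * r[j][k] for k in range(3)) for j in range(3)]
            for i in range(3)]

def invariants(s):
    """Three scalar invariants (I1, I2, I3) of a symmetric 3x3 stress tensor."""
    i1 = s[0][0] + s[1][1] + s[2][2]
    i2 = (s[0][0] * s[1][1] + s[1][1] * s[2][2] + s[2][2] * s[0][0]
          - s[0][1] ** 2 - s[1][2] ** 2 - s[2][0] ** 2)
    i3 = (s[0][0] * (s[1][1] * s[2][2] - s[1][2] * s[2][1])
          - s[0][1] * (s[1][0] * s[2][2] - s[1][2] * s[2][0])
          + s[0][2] * (s[1][0] * s[2][1] - s[1][1] * s[2][0]))
    return (i1, i2, i3)

# Hypothetical stress state in arbitrary units, chosen only for illustration.
stress = [[50.0, 10.0, 0.0],
          [10.0, 30.0, 5.0],
          [0.0, 5.0, 20.0]]

before = invariants(stress)
after = invariants(rotate_z(stress, math.radians(37.0)))
print(before)
print(after)
```

The individual components of the tensor change under the rotation, but the three invariants agree to within rounding error, which is the property the fracture function is required to have.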

This concept of ‘fracture surface’ has no equivalent in psy- 
chology, but potentially it would appear to have considerable 
heuristic value. Presumably axes having the properties mentioned 
above could be located through factorial studies, based on controlled 
observation in the case of human subjects, and based on experiment 
in the case of animals. The total size of the fracture surface 
would presumably be identifiable with our ‘neuroticism’ factor. 

Our discussion so far may appear theoretical and remote 
because in actual fact the metals we work with are neither homogeneous 
nor isotropic, and certainly in the psychological field 
anisotropy and non-homogeneity are more prevalent. However, 
science advances from the simple to the complex, and what has 
been said so far may serve as a first approximation. We must now 
try to encompass certain complicating factors. 

The three axes along which we plotted the fracture surface are 
strictly arbitrary in their location, although conventionally one 
would presumably orient them in relation to the direction of the 
application of the stress. When the same treatment is applied to 
anisotropic material, certain positions of the axes become preferable; 
these can be determined by the differential reaction to stress 
of different parts of the material, by differing historical conditions 
of different parts of the material, or both. Similarly, it is the 
existence of inhomogeneity in human behaviour which decides us to 
prefer one set of factor axes to another, although even then no 
‘absolute’ value attaches to such a position. 

1 The metal under discussion has been taken as essentially homogeneous and 
isotropic because the detailed variations in stress from grain to grain or even 
from point to point in a single grain of polycrystalline metal are not known at 
present. Most studies on fracture in polycrystalline metals have neglected the 
complications arising from the crystalline structure of the metals. 

One important change in our picture which must accompany 
the shift from isotropic to anisotropic material is the inclusion of 
time as an additional dimension. It has been known for a long 
time that brittle solids may support a static stress for long periods 
of time without the slightest evidence of approaching failure, which 
then occurs with great suddenness. Experimental evidence shows 
that there is a definite relation between the intensity of the stress 
to which the brittle solid is subjected, and its duration necessary to 
produce a fracture. It was found that ‘the duration of the stress 
required to obtain the failure decreased very sharply as the intensity 
of the applied stresses was increased’ (Poncelet, 1948). 

This phenomenon of ‘static fatigue’ cannot be reconciled with 
any of the classical theories of strength of brittle solids, which do 
not take the duration factor into consideration. The first step 
towards a solution of this dilemma was taken by Inglis (1913), 
whose observation that scratches or cracks reduce the strength of 
hardened steel plates, through the considerable intensification of 
the applied stresses at the crack tips, was taken up by Griffith 
(1921, 1924) in his famous ‘crack’ theory. According to this view, 
the notorious weakness of brittle solids to tensile stresses is attributed 
to the existence of indefinite ‘flaws’ in the solids; these flaws he 
believed to be in essence submicroscopic cracks. He assumed the 
structure to be continuous and developed his views along thermodynamic 
lines.¹ ‘This new approach . . . introduced the factor 
time in the expression which determines the breaking stress . . . 
of brittle solids, and gives an intelligible interpretation of the 
phenomenon of static fatigue’ (Poncelet, 1948). It also explains 
why the average stress required for the fracture of solids is many 
times smaller than that calculated from the forces acting between 
the atoms. 

1 This creates a difficulty regarding the explanation of these flaws, which 
can be sidestepped by assuming the structure of the solids to be particulate, and 
composed of ions. In this way the phenomenon is brought into the field of 
general statistical mechanics. 

How is all this relevant to psychology? The application is surprisingly 
direct. Let us recall for a moment Thomson's so-called 
‘sampling theory’ of intelligence. He explains the observed pattern 
of intercorrelations between intelligence tests in terms of bonds. 
‘Each test calls upon a sample of the bonds which the mind can form 
. . . some of these bonds are common to two tests and cause their 
correlation.’ While not venturing to suggest in detail what the 
bonds might be, Thomson believes that ‘they are fairly certainly 
associated with the neurones or nerve cells of our brains, of which 
there are approximately one hundred thousand million in each 
normal brain’. We may easily consider a theoretical equivalent 
to Thomson’s cognitive hypothesis in the affective and conative 
sphere by postulating that the observed pattern of intercorrelations 
between neuroticism tests can be explained in terms of faults. Each 
test calls upon a sample of the faults which are inherent in the mind 
. . . some of these faults are common to two tests and cause their 
correlation. No neurological locus will be suggested for these 
hypothetical faults, although it would not be difficult to find many 
possible claimants, particularly in the autonomic sphere. Here, 
then, we have a concept directly related to our previous discussion, 
because clearly these hypothetical faults are strictly analogous to 
Griffith's submicroscopic cracks. 
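
The way in which shared elements give rise to a correlation can be illustrated by a minimal sketch. It rests on a simplifying assumption not stated in the text: each bond (or fault) contributes an independent element of unit variance, and a test score is the plain sum of the elements that test samples. On that assumption the correlation between two tests is the number of shared elements divided by the geometric mean of the two sample sizes. The pool of 100 elements and the three samples are hypothetical.

```python
import math

def overlap_correlation(test_a, test_b):
    """Correlation between two test scores when each score is the sum of
    independent, unit-variance elements (bonds or faults) sampled by the test."""
    shared = len(test_a & test_b)
    return shared / math.sqrt(len(test_a) * len(test_b))

# Hypothetical pool of 100 elements; each test draws on an arbitrary sample.
test_1 = set(range(0, 40))     # elements 0..39
test_2 = set(range(20, 70))    # elements 20..69, sharing 20 with test_1
test_3 = set(range(60, 100))   # elements 60..99, sharing none with test_1

r_12 = overlap_correlation(test_1, test_2)
r_13 = overlap_correlation(test_1, test_3)
r_23 = overlap_correlation(test_2, test_3)
print(round(r_12, 3), round(r_13, 3), round(r_23, 3))
```

Tests with no elements in common are uncorrelated, and the correlation rises with the size of the overlap; nothing in the sketch depends on whether the elements are thought of as cognitive bonds or as affective faults.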

Certain other similarities may be stressed. One of these is the 
almost complete failure of classical theories to predict empirical 
results. There is little cause to document this statement with respect 
to psychology; in the physical field we may quote Slater's statement 
that ‘we have been upset in modern times that materials so 
eminently “ductile” as mild steel, as revealed by our old-fashioned 
tensile test, have proved so undeniably and catastrophically brittle 
on many momentous occasions’ (1948). Indeed, the failure of 
classical theory is perhaps more noticeable in the physical field 
because nature calls the scientist’s bluff more clearly and definitely 
there than it does in the mental field. The ship sinks; the bridge 
breaks; the steel tower topples—these are events which cannot be 
explained away. But the patient who does not improve under 
psychoanalytic therapy, or who does not react to E.C.T.; the prediction 
of combat success or academic achievement that is not 
borne out in fact; the clinical diagnosis that is refuted by later 
developments—these can be argued away in a facile, semantic 
fashion. Physics lives in the world of the reality principle; psy- 
chology and psychiatry have not altogether emerged from the 
world of the pleasure principle. 

Yet some at least of the failures of the psychologist must be laid 
at the door of those who determine the problems which he is to 
[word illegible], and the conditions under which he shall attempt to solve them. 
It is quite customary to ask the psychologist to predict the success of a 
group of people without specifying the exact nature of the situation 
in which success or failure will be decided. Thus the subjects who 
passed successfully through the O.S.S. assessment 
might be detailed to do one of a large number of very dissimilar 
jobs, ranging all the way from putting on propaganda shows in 
oriental countries to being dropped as secret agents into enemy-occupied 
territory. This situation would find an analogue in the 
physicist’s being asked which of a large number of metal bars is 
likely to fracture, but without his being given any information as 
to the precise stress which would be applied to each of these bars. 
His ‘prediction’, if indeed he agreed to furnish one under such 
impossible conditions, would amount to little more than a measurement 
of the present strength of the various bars. The success of his 
prediction would depend far more on the conditions to which 
these specimens would be exposed later, than on the accuracy of 
his measurement. 

Again, while the physicist's failures are spectacular, they do 
furnish a criterion which does not admit of argument. The 
psychologist's criterion is only too often unreliable, illogical, and 
determined by extraneous considerations; low predictive power 
is more often due to faults in the criterion than to faults in the 
predicting variables. What is worse, administrative action so often 
interferes with his attempts to establish proper controls, so that it 
becomes quite impossible to evaluate properly the results of his 
procedures. In comparing forecasting accuracy of physical and 
psychological tests, these considerations must always be borne in mind. 


Similar considerations apply to the situation in which the 
psychologist is called in to ‘a[...] experimental investig[...] 
to the particular situ[...] is seldom given the o[...] 
although in his cas[...] instead, to advise o[...] 
basis of such very limited data as may be available; indeed, he must 
count himself lucky if there are any data at all. Until those who 
pose the questions gain some insight into the methods of science, 
and learn to provide suitable conditions for an experimental deter- 
mination of the answer, psychology will continue to operate at a 
low level of scientific respectability. 

Given the existing external circumstances, however, the psy- 
chologist can make a maximum contribution by (a) isolating the 
main dimensions of personality, and (b) improving the accuracy 
with which they can be measured. It is with these two aims in view 
that the experiments described in this book have been carried out. 
To what extent may they be said to have been achieved? The 
isolation of three main dimensions—neuroticism, psychoticism, 
extraversion—may be said to have been accomplished at a reason- 
ably high level of confidence, although it would seem most desir- 
able that all of this work should be repeated independently at 
other institutions; it is unfortunate that psychology has not adopted 
the practice widespread in the physical sciences where new claims 
are routinely tested by several institutions. Only in this way can we 
hope to build up a stock of verified trustworthy facts and principles; 
the single experiment may be of outstanding interest, but it 
can never be definitive. At the same time this procedure ensures 
that one of the most important of scientific requirements is met, 
namely reproducibility of experimental conditions. Identifiability 
of groups, of sampling, of experimental procedure; repeatability of 
diagnoses, of tests, of treatments: all these must be ensured if an 
experiment is to be accepted as being of scientific value. And there 
is no better proof of reproducibility of conditions than reproduction 
of results, although the inverse of this relationship does not of 
course hold. Subject, then, to such repetition we may claim to have 
made a contribution to our first task. 

Regarding the actual measurement of these dimensions, we 
have shown in the text that neuroticism can be measured with a 
reliability in excess of .85 and a validity of about [figure illegible]; these figures 
are minimum rather than maximum estimates, and are certainly 
encouraging. No exact figures can be given for the other two 
dimensions, as less work has been done in connection with them; 
there is no reason to believe that measurement there should be 
inherently less exact or less reliable. There are, however, certain 
warnings which long experience with work of this kind makes 
desirable in order to avoid disappointment. 

In the first place, while on most of the groups tested 
the same tests are found to [...], it is quite common 
that in a particular group one test will fall out of line. Some 
examples of this tendency have been mentioned in the 
text. Thus for middle-class groups the Crown Word Connection 
List has a high saturation with neuroticism, and low saturation 
with intelligence; for a group of unskilled workers studied by [name illegible] 
(1951) this test was found, however, to have only relatively small 
saturations with neuroticism, but high saturations with intelligence. 
This change can of course be explained in terms of different standards 
of literacy; nevertheless it indicates that results from one group 
cannot immediately be extended to other groups differing from 
the original one with respect to intelligence, social status, or in 
other important ways. Again, while for groups of normal intelligence 
the persistence test usually has good saturations with 
neuroticism, it has been found quite consistently (Brady, 194[?]; 
O’Connor, 1951) that in mental defectives there is hardly any 
correlation between persistence and neuroticism. No obvious explanation 
of this difference suggests itself, and a special investigation 
would be required to study this discrepancy. Indeed, it is very 
likely that a great deal may be learned about the nature of the 
factors involved by the careful study of such discrepant results. 

A second difficulty which should not be underrated is related 
to the problem of test standardization. We have ample evidence 
that slight changes in the test may produce profound change[...] 
[Th]us changing the method of recording sway on the 
body sway test, or chan[...] ‘sway forward’ is made, [...] 
results obtained. Whethe[...] trying to measure are those 
somewhat intangible things (ego involvements, motivation, 
strength of drives), then we cannot in 
any simple way control the physical properties of our stimuli and 
assume equivalence. Here again, differences in results from one 
experiment to another may at first be disheartening; if they are 
regarded less as experimental failures, but rather as leads to definite 
hypotheses, and stimuli to further experimentation devoted to a 
solution of the particular problem created by these inconsistencies, 
they may actually be of considerable use and importance. 

A third difficulty is connected with the one just mentioned. 
From many points of view, it would be desirable to keep tests 
identical from one experiment to another, so that cross-comparisons 
can be made. But if that is done we miss our opportunity to improve 
existing tests, and make use of such knowledge as we may have 
acquired during the course of our experiments. These two require- 
ments have to be carefully balanced, and a decision is not always 
easy. Premature standardization may serve merely to crystallize 
poor and ineffective practices; constant changes may make all 
useful comparisons impossible. In the past we have laid greater 
emphasis on improvement and change; in the future we shall lay 
greater emphasis on standardization. But this problem cannot be 
settled by any facile generalization, and it is likely to arise whenever 
a particular experiment is being devised. 

This point is closely related to the question of the amount of 
stress which the experimenter is to lay on the two main aspects of 
his work. He may either use many rough-and-ready tests, in order 
to isolate the main dimensions along which he intends to work; or 
he may carry out careful and detailed experimentation on a small 
number of tests, in the hope of refining these sufficiently to derive 
meaningful scores from them. An example may make clear the 
difference between these two approaches. We have used in our 
work raw scores on tests such as the Body Sway Suggestibility test, 
or the fluency test, and shown that these tests correlate significantly 
with the neuroticism factor. While such findings are interesting 
and important, they must nevertheless leave the investigator somewhat 
dissatisfied. Performance on even such relatively simple tests 
is certainly highly complex, and it is difficult to interpret findings 
psychologically. Thus Bousfield and Sedgewick (1944) made a 
detailed study of fluency tests, and found that curves of output 
could be described by an exponential equation: N = C(1 - e^{-mt}), 
where N is the number of words written up to a given time, t; C is 
the upper limit that the curve approaches asymptotically, presumably 
identical with the total supply of the kind of words called 
for by the instructions, while e is the base of the natural 
logarithms and m measures the rate of depletion of the supply of 
words. These writers have also shown that the mathematically 
derived constants have psychological as well as mathematical 
meaning. When their subjects were given preliminary practice in 
naming automobiles the rate constant (m) increased, but not the 
supply of names constant (C). In another experiment, Bousfield 
& Sedgewick (1944) found that the theoretical supply of pleasant 
words was considerably larger than the supply of unpleasant words, 
but that the rate constant was the same. These findings make 
psychological sense, and suggest that this theory can be applied to 
individual differences in fluency. The number of words written in 
a given time is of course the raw score conventionally used as a 
measure of an individual's fluency; C and m are analytically 
derived scores which in conjunction with each other determine the 
raw scores. 
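
The manner in which C and m jointly determine the raw score may be seen from a minimal sketch of the equation N = C(1 - e^{-mt}). The two subjects and their constants are hypothetical, chosen only to bring out the point, and the unit of time (minutes) is likewise an assumption; none of these figures comes from Bousfield and Sedgewick.

```python
import math

def words_written(c, m, t):
    """Raw fluency score N = C(1 - e^(-mt)) at time t."""
    return c * (1.0 - math.exp(-m * t))

# Hypothetical subjects: the constants are illustrative, not taken from the study.
large_supply_slow = {"c": 60.0, "m": 0.05}   # large fund of words, slow production
small_supply_fast = {"c": 30.0, "m": 0.30}   # small fund of words, fast production

for t in (2, 30):  # test lengths in minutes (an assumed unit)
    a = words_written(t=t, **large_supply_slow)
    b = words_written(t=t, **small_supply_fast)
    print(f"t={t:>2}: large supply, slow rate N={a:5.1f}; "
          f"small supply, fast rate N={b:5.1f}")
```

On the short test the subject with the higher rate constant obtains the higher raw score; on the long test the order is reversed, because each score is then close to its own limit C. The same pair of subjects is thus ranked differently by the raw score according to the length of the test alone.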

If we accept this theory, we see immediately that intercorrelations 
of raw scores, factor analyses of such intercorrelations, and 
correlations of raw scores with outside criteria, all cease to have 
any interpretable meaning. Observed correlations may be due to 
the influence of the C constant, the m constant, or any combination 
of the two; they would fluctuate in an unpredictable manner 
according to such variables as type of material selected, or length 
of test. (Clearly the raw score on a short test depends more on m; 
[...]) In other words, analysis of raw scor[...] 
[ana]lysis of this test (Eysenck, 1947) [...] 
factors determine the raw score: the [...] 
to correlate neuroticism with ‘ability to resist suggestion’. However, 
as we cannot at the moment measure ‘aptitude’ separately, and 
thus derive a score for ‘ability to resist’ directly, we must remain 
content with the raw score, which will be considerably attenuated 
by the influence of ‘aptitude’. Work is proceeding at the moment 
to obtain separate measurements for the two parts of the ratio, 
but whatever the result of these endeavours, it must be 
clear that the raw score, in spite of its apparent simplicity, is in 
reality quite a complex resultant of several unrelated forces. 

What is true of these two tests is presumably true, to a greater 
or lesser degree, of all our tests. Although superficially simple, yet 
in reality they are quite complex, and the raw score may bear no 
very constant or high relation to any of the forces which determine 
it. This fact may explain the comparatively low correlations which 
are usually found between tests of neuroticism; we are correlating 
raw scores, each of which is the resultant of several forces of which 
only one may be relevant to our analysis. If we could carry our 
analysis further, and obtain direct measures of these relevant 
forces, the attenuation of our correlations would be much reduced, 
and presumably the size of the correlation would be increased. 
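
The attenuation described here can be put in the form of a small worked calculation. It is a sketch under an assumed model, not a result from the text: each raw score is taken to be the sum of one relevant force, common to both tests, and of irrelevant forces peculiar to each test, all mutually independent. The variances used are hypothetical.

```python
def raw_score_correlation(relevant_var, irrelevant_var_1, irrelevant_var_2):
    """Correlation between two raw scores X1 = F + E1 and X2 = F + E2, where F is
    the shared relevant force and E1, E2 are independent irrelevant forces."""
    covariance = relevant_var                      # only F is common to both scores
    sd_1 = (relevant_var + irrelevant_var_1) ** 0.5
    sd_2 = (relevant_var + irrelevant_var_2) ** 0.5
    return covariance / (sd_1 * sd_2)

# Hypothetical variances: the irrelevant forces carry three times the variance
# of the relevant one in each raw score.
attenuated = raw_score_correlation(1.0, 3.0, 3.0)

# The same two tests if the relevant force could be measured directly.
purified = raw_score_correlation(1.0, 0.0, 0.0)

print(attenuated, purified)
```

With three parts of irrelevant variance to one of relevant variance, two tests which measure exactly the same underlying force correlate only 0.25; as the irrelevant variance is removed the correlation approaches unity.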
At the same time we might in this way obtain evidence regard- 
ing the contradictory findings which characterize so much of the 
literature on personality testing. Let us assume, for instance, that 
experimenter A finds that a test of fluency has a high saturation for 
intelligence, but a low saturation for neuroticism, while B reports 
that his test of fluency has low saturation for intelligence, but high 
saturation for neuroticism. These results obtained from the raw 
scores appear contradictory and inexplicable. Now let us assume, 
as is quite likely, that the ‘total fund of relevant words’, C, is 
highly correlated with intelligence while the ‘rate of production’, 
m, is highly correlated with neuroticism. A short test would depend 
more on m than on C, and would thus be a test of neuroticism more 
than of intelligence. A rather longer test would depend more on C 
than on m, and would thus be a test of intelligence rather than one 
of neuroticism. Thus by simply varying the length of the test, 
experimenter B might obtain superficially discrepant results from 
A. By working with analytical scores, both experimenters should 
find that C had high saturations on intelligence, m on neuroticism, 
regardless of the length of test used. Similar considerations, of 
course, apply to other variables, such as level of education, age, 
type of stimulus material, or sex; superficially discrepant results 
obtained from raw scores should be reconcilable in terms of analytic 
scores. 
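
The case of experimenters A and B can be imitated by a small simulation. Everything in it is an assumption made for illustration: C is made to depend only on a simulated intelligence score and m only on a simulated neuroticism score (with a positive sign), the distributions and constants are arbitrary, and the ‘saturations’ are simple correlations of the raw score with each trait rather than factor loadings.

```python
import math
import random

def pearson(xs, ys):
    """Product-moment correlation between two equally long lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def saturations(test_length, n_subjects=2000, seed=0):
    """Correlate raw fluency scores N = C(1 - e^(-mt)) with the two traits."""
    rng = random.Random(seed)
    intelligence, neuroticism, raw = [], [], []
    for _ in range(n_subjects):
        g = rng.gauss(0.0, 1.0)            # simulated intelligence
        n = rng.gauss(0.0, 1.0)            # simulated neuroticism
        c = 40.0 * math.exp(0.25 * g)      # assumed: supply depends on intelligence
        m = 0.10 * math.exp(0.50 * n)      # assumed: rate depends on neuroticism
        intelligence.append(g)
        neuroticism.append(n)
        raw.append(c * (1.0 - math.exp(-m * test_length)))
    return pearson(raw, intelligence), pearson(raw, neuroticism)

short = saturations(test_length=1.0)    # (with intelligence, with neuroticism)
long_ = saturations(test_length=60.0)
print("short test:", short)
print("long test: ", long_)
```

Under these assumptions the short test correlates chiefly with neuroticism and the long test chiefly with intelligence, although the same subjects and the same two constants underlie both sets of raw scores.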

The conclusion might be drawn from this discussion that the 
most urgent task in the psychology of personality is the detailed 
analysis of individual tests and techniques along these lines. This 
conclusion would, in our view, be as false as the opposite one that 
the isolation of the main dimensions must precede such detailed 
analysis. In our view, both procedures must go hand in hand; 
promising tests are discovered in terms of their correlation with 
neuroticism, or psychoticism, or extraversion; they must then be 
analysed in detail, purified, and the new analytical scores used in 
order to determine with even greater accuracy the relevant factor, 
It is this interplay of statistical and experimental procedures which 
in our view will lead to important developments in the future. 